Abstract. We find the weak rate of convergence of approximate solutions of the nonlinear
stochastic heat equation, when discretized in space by a standard finite element method. Both
multiplicative and additive noise are considered under different assumptions.

This extends an earlier result of Debussche in which time discretization is considered for 
the stochastic heat equation perturbed by white noise. It is known that this equation only 
has a solution in one space dimension. In order to get results for higher dimensions, colored 
noise is considered here, besides the white noise case where considerably weaker assumptions
on the noise term are needed. Integration by parts in the Malliavin sense is used in the proof.
The rate of weak convergence is, as expected, essentially twice the rate of strong convergence. 



1. Introduction and main result 

Let $\mathcal{D} \subset \mathbb{R}^d$ be a bounded, convex and polygonal domain. We consider, for $T > 0$, the
stochastic heat equation with Dirichlet boundary condition, written in abstract form as a stochastic
evolution equation in $H = L_2(\mathcal{D})$:

(1.1) $dX(t) + [AX(t) - f(X(t))]\,dt = g(X(t))\,dW(t), \quad t \in (0,T]; \quad X(0) = X_0.$

This equation is driven by a cylindrical $Q$-Wiener process $(W(t))_{t\in[0,T]}$ on a filtered probability
space $(\Omega, \mathcal{F}, (\mathcal{F}_t)_{t\in[0,T]}, P)$. The covariance operator $Q$ is selfadjoint and positive semidefinite,
not necessarily of finite trace. For technical reasons we consider a deterministic initial value
$X_0 \in H$.

The leading linear operator $A$ is, for simplicity, taken to be $-\Delta$ with domain $\mathrm{dom}(A) =
H^2(\mathcal{D}) \cap H_0^1(\mathcal{D})$, where $\Delta = \sum_{k=1}^d \partial^2/\partial x_k^2$ is the Laplace operator. It is well known that $-A$
generates an analytic semigroup of bounded linear operators on $H$. We denote it by $(E(t))_{t\ge 0}$.
The spaces $\dot H^\beta = \mathrm{dom}(A^{\beta/2})$, defined by fractional powers of $A$, are used to measure the spatial
regularity. We denote the norm and inner product in $H = L_2(\mathcal{D})$ by $\|\cdot\|$ and $\langle\cdot,\cdot\rangle$.

Let $U$, $V$ be separable Hilbert spaces and let $\mathcal{L}(U,V)$ denote the Banach space of all linear
bounded operators. We denote by $\mathcal{L}_1(U,V) \subset \mathcal{L}_2(U,V) \subset \mathcal{L}(U,V)$ the subspaces consisting
of trace class operators and Hilbert-Schmidt operators, respectively. We use the abbreviations
$\mathcal{L}(U) = \mathcal{L}(U,U)$, $\mathcal{L} = \mathcal{L}(H)$ when $H = L_2(\mathcal{D})$, and similarly for $\mathcal{L}_p$, $p = 1,2$. Central in the
theory of stochastic integration is the space $U_0 = Q^{1/2}(H)$. We write $\mathcal{L}_2^0 = \mathcal{L}_2(U_0,H)$. By
$C_b^k(U,V)$ we denote the space of not necessarily bounded functions from a Banach space $U$ to a
Banach space $V$ that have continuous and bounded Fréchet derivatives of orders $1, \dots, k$. For
more precise definitions, see Section 2 below.

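We use a "regularity parameter" $\beta$ such that $\|A^{(\beta-1)/2}\|_{\mathcal{L}_2^0} = \|A^{(\beta-1)/2}Q^{1/2}\|_{\mathcal{L}_2} < \infty$. If $Q = I$, then
$\|A^{(\beta-1)/2}\|_{\mathcal{L}_2^0} = \|A^{(\beta-1)/2}\|_{\mathcal{L}_2} < \infty$ if and only if $d = 1$ and $\beta < \frac12$, see (2.9). We consider two sets of
assumptions according to the type of noise term.

A. Additive noise in multiple dimensions. Assume that $f \in C_b^2(H,H)$, $g(x) = I$ for all
$x \in H$, and $\|A^{(\beta-1)/2}\|_{\mathcal{L}_2^0} = \|A^{(\beta-1)/2}Q^{1/2}\|_{\mathcal{L}_2} < \infty$ for some $\beta \in [\frac12, 1]$.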

B. Multiplicative noise in one dimension. Assume that $f \in C_b^2(H,H)$, $g(x) = B + Cx + \tilde g(x)$,
where $B \in \mathcal{L}$, $C \in \mathcal{L}(H,\mathcal{L})$, and $\tilde g \in C_b^2(\dot H^{-*}, \mathcal{L})$. Moreover, assume that $d = 1$, $Q = I$,
and select any $\beta \in (0, \frac12)$.

Under either of these assumptions we have a unique mild solution to (1.1), satisfying the
stochastic fixed point equation

(1.2) $X(t) = E(t)X_0 + \int_0^t E(t-s)f(X(s))\,ds + \int_0^t E(t-s)g(X(s))\,dW(s), \quad t \in [0,T].$

One can also show that the solution has spatial regularity of order $\beta$, i.e., it is of the form
$X\colon [0,T] \times \Omega \to \dot H^\beta$, $P$-almost surely, see Theorem 2.3 below and the discussion preceding it.

In this paper we consider space discretization of equation (1.1) by means of a standard
finite element method. Let $(S_h)_{h\in(0,1)}$ be the family of spaces of continuous piecewise linear
functions corresponding to a quasi-uniform family of triangulations of $\mathcal{D}$ with $S_h \subset H_0^1(\mathcal{D})$. The
parameter $h$ specifies the maximal diameter in the triangulation. Let $P_h\colon H \to S_h$ denote the
orthogonal projection. We define the discrete Laplacian as the operator $A_h\colon S_h \to S_h$ satisfying
the variational equality

(1.3) $\langle A_h\psi, \chi\rangle = \langle\nabla\psi, \nabla\chi\rangle, \quad \forall \psi, \chi \in S_h.$

The finite element approximation of the elliptic problem $Au = f$ is the unique solution of the
equation $A_hu_h = P_hf$. It is known that $\|u_h - u\| = \|A_h^{-1}P_hf - A^{-1}f\| = O(h^2)$ as $h \to 0$, if
$f \in L_2(\mathcal{D})$. The semigroup generated by $-A_h$ is denoted $(E_h(t))_{t\ge 0}$. The spatially semidiscrete
analogue of (1.1) is to find a process $(X_h(t))_{t\in[0,T]}$ with values in $S_h$ such that

(1.4) $dX_h(t) + [A_hX_h(t) - P_hf(X_h(t))]\,dt = P_hg(X_h(t))\,dW(t), \quad t \in (0,T]; \quad X_h(0) = P_hX_0,$

or in mild form

(1.5) $X_h(t) = E_h(t)P_hX_0 + \int_0^t E_h(t-s)P_hf(X_h(s))\,ds + \int_0^t E_h(t-s)P_hg(X_h(s))\,dW(s), \quad t \in [0,T].$
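
To make the $O(h^2)$ rate for the elliptic problem concrete, the following sketch solves $-u'' = f$ on $(0,1)$ with homogeneous Dirichlet conditions by continuous piecewise linear elements on a uniform mesh and estimates the observed $L_2$ rate. The one-dimensional domain, the manufactured solution $u(x) = \sin(\pi x)$, the quadrature rule and all function names are assumptions made for illustration; none of them is taken from the paper.

```python
import math

# Three-point Gauss-Legendre rule on the reference interval [-1, 1].
GAUSS = [(-math.sqrt(0.6), 5.0 / 9.0), (0.0, 8.0 / 9.0), (math.sqrt(0.6), 5.0 / 9.0)]


def exact_u(x):
    """Manufactured solution (an assumption of this sketch)."""
    return math.sin(math.pi * x)


def rhs_f(x):
    """Right-hand side f = -u'' for the manufactured solution."""
    return math.pi ** 2 * math.sin(math.pi * x)


def solve_tridiagonal(lower, diag, upper, rhs):
    """Thomas algorithm for a tridiagonal linear system."""
    n = len(diag)
    d, b = diag[:], rhs[:]
    for i in range(1, n):
        w = lower[i - 1] / d[i - 1]
        d[i] -= w * upper[i - 1]
        b[i] -= w * b[i - 1]
    x = [0.0] * n
    x[-1] = b[-1] / d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = (b[i] - upper[i] * x[i + 1]) / d[i]
    return x


def fem_l2_error(n_elements):
    """L2 error of the P1 finite element solution on a uniform mesh of (0, 1)."""
    h = 1.0 / n_elements
    n = n_elements - 1  # number of interior nodes
    load = [0.0] * n    # load vector <f, phi_i>, i.e. the weak form of P_h f
    for e in range(n_elements):
        x_left = e * h
        for xi, w in GAUSS:
            x = x_left + 0.5 * h * (xi + 1.0)
            wq = 0.5 * h * w
            phi_right = (x - x_left) / h
            phi_left = 1.0 - phi_right
            if e >= 1:          # left node of element e is interior
                load[e - 1] += wq * rhs_f(x) * phi_left
            if e <= n - 1:      # right node of element e is interior
                load[e] += wq * rhs_f(x) * phi_right
    # Stiffness matrix <phi_i', phi_j'> is tridiagonal: (1/h) * [-1, 2, -1].
    interior = solve_tridiagonal([-1.0 / h] * (n - 1), [2.0 / h] * n,
                                 [-1.0 / h] * (n - 1), load)
    nodal = [0.0] + interior + [0.0]
    err2 = 0.0
    for e in range(n_elements):
        x_left = e * h
        for xi, w in GAUSS:
            x = x_left + 0.5 * h * (xi + 1.0)
            wq = 0.5 * h * w
            phi_right = (x - x_left) / h
            u_h = nodal[e] * (1.0 - phi_right) + nodal[e + 1] * phi_right
            err2 += wq * (exact_u(x) - u_h) ** 2
    return math.sqrt(err2)


def observed_rate(n_coarse):
    """Observed convergence order when the mesh size is halved."""
    return math.log(fem_l2_error(n_coarse) / fem_l2_error(2 * n_coarse)) / math.log(2.0)
```

Calling `observed_rate(16)` compares meshes with 16 and 32 elements; for this smooth test problem the value should be close to 2.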

The existence of a unique mild solution can be proved in a similar way as for (1.2). It is also
known that we have strong convergence of order $\beta$ under Assumptions A or B, see (2.26). Our
goal is to prove weak convergence in the form

$E[G(X(T)) - G(X_h(T))] = O(h^{2\beta-\epsilon}),$

for any $\epsilon > 0$ and any test function $G \in C_b^2$.

For an exhaustive list of references for approximations of stochastic partial differential equa- 
tions, see, e.g., [5]. We mention some works related to the situation studied here. Weak conver- 
gence of numerical schemes for linear equations with additive noise is treated in [6], [14], [13], 
and [H]. In the first paper full discretization of the stochastic heat equation is considered for 
colored noise in multiple dimensions, i.e., our Assumption A with $f = 0$. Papers [14] and [13]
deal with semidiscretization in space and full discretization, respectively, for the linear stochas- 
tic heat, Cahn-Hilliard, and wave equations, also with additive colored noise. The fourth paper 
provides an extension to impulsive noise. 

The only results on weak convergence for nonlinear equations are those of [9], [10], [4], [5], [1]
and [21]. In the work [5], discretization in time with implicit Euler and Crank-Nicolson schemes
is considered for semilinear parabolic equations with additive noise. Paper [10] treats the wave
equation with additive white noise, discretized by a leap-frog scheme. This case is a bit different
from the others, due to the lack of analyticity of the semigroup for the wave equation in contrast
to the heat equation. In [4] semidiscretization in time for the nonlinear stochastic Schrödinger
equation with multiplicative white noise is considered. 

The papers [I], [B], [H], [T3], and [PS] express the weak error by means of a Kolmogorov 
equation after removing the linear term $AX(t)$ by a transformation of variables. This
transformation does not work for the nonlinear heat equation. This difficulty is handled in [5] by
means of an integration by parts from the Malliavin calculus. This paper proves weak
convergence of temporal semidiscretizations for the nonlinear heat equation with multiplicative noise
in one space dimension, i.e., our Assumption B. Under the same assumptions, except for an extra
boundedness condition on the nonlinearity, in [1] the method of [5] is exploited to prove weak
convergence for the invariant measure of temporally discrete approximations. In [24] the same
proof technique is used to study time discretization for the heat equation with additive noise in 
multiple dimensions, i.e., our Assumption A. 

In the present paper we extend the results of [24] and [5] to spatial discretization. Our 
Assumptions A and B coincide with the ones in these two papers, respectively. Therefore we 
may quote some moment estimates from these papers. One difficulty that arises in connection 
with the spatial discretization is that the projector $P_h$ does not commute with the projector
onto eigenspaces of A. 

In all these works the rate of weak convergence is, up to an arbitrary e > 0, twice that of 
strong convergence. The Malliavin calculus is a useful tool in the study of weak convergence of 
semilinear equations. It has been utilized in [9], [5], and [15] in completely different ways. It
plays a central role in the proof of our Theorem 1.1, following the method of [5]. In the papers
[1] and [24] the technique of [5] is also used.

The result of this paper actually concerns the convergence of the law $\mathcal{L}(X_h(T)) = P \circ
(X_h(T))^{-1}$ of the random variables $(X_h(T))_{h\in(0,1)}$, as the mesh size parameter $h \to 0$. We say
that the law of $X_h(T)$ converges weakly to that of $X(T)$, if $E[G(X_h(T))] \to E[G(X(T))]$ as
$h \to 0$, for all test functions $G \in C_b(H,\mathbb{R})$, the space of all bounded continuous functions on $H$.
This convergence follows from the strong convergence $E[\|X_h(T) - X(T)\|^2] = O(h^{2\beta})$, see [16]
and the discussion below, and the weak rate obtained is thus $\beta$ under mild assumptions. For
$G \in C_b^2(H,\mathbb{R})$, we obtain in this paper the rate of weak convergence $2\beta - \epsilon$, for an arbitrary
$\epsilon > 0$.

Theorem 1.1. Assume either Assumption A or Assumption B and let $X$ and $X_h$ be the
solutions of the equations (1.2) and (1.5), respectively. Then, for every test function $G \in C_b^2(H,\mathbb{R})$
and $\gamma \in [0,\beta)$, we have the convergence

$|E[G(X(T)) - G(X_h(T))]| = O(h^{2\gamma}), \quad \text{as } h \to 0.$

The weak error is interesting for various reasons. It measures the error made by sampling
from an approximate probability law of $X(T)$, rather than the deviation from the trajectory
of an exact solution, as for the strong error. The result tells us that the weak error, when
approximating the quantity $E[G(X(T))]$ by $E[G(X_h(T))]$, decreases fast as $h \to 0$ for smooth
$G$.

Section 2 is devoted to preliminaries. In Subsection 2.1 compact operators and tensor products
are introduced. We need Schatten classes more general than the trace class and Hilbert-Schmidt
operators. In Subsection 2.2 some notation for Fréchet derivatives is fixed. The semigroup
framework and basic material on the finite element method are presented in Subsection 2.3. In
Subsection 2.4 the Malliavin calculus and stochastic integration are introduced. Subsection 2.5
is about the stochastic equations (1.2) and (1.5). In Section 3 two moment estimates for the
Malliavin derivative of $X_h(t)$ are proved. Section 4 contains regularity results for the Kolmogorov
equation, adapting results from [5] and [24] to our setting. The proof of Theorem 1.1 is given
in Section 



2. Preliminaries

2.1. Compact operators and tensor products. Given two separable real Hilbert spaces
$(U, \langle\cdot,\cdot\rangle_U)$ and $(V, \langle\cdot,\cdot\rangle_V)$, let $\mathcal{L}(U,V)$ denote the Banach space of all bounded and linear
operators $U \to V$ endowed with the uniform norm. We write $\mathcal{L}(U) = \mathcal{L}(U,U)$. Let $(\sigma_i)_{i\in\mathcal{I}}$ be
the collection of singular values of a compact operator $T \in \mathcal{L}(U)$. These are the eigenvalues of
the operator $|T| = (TT^*)^{1/2}$. The index set $\mathcal{I}$ is finite or countable. Let, for $1 \le p < \infty$, the
Schatten class $\mathcal{L}_p = \mathcal{L}_p(U)$ be all $T \in \mathcal{L}(U)$ for which

(2.1) $\|T\|_{\mathcal{L}_p} = \Big(\sum_{i\in\mathcal{I}} \sigma_i^p\Big)^{1/p} < \infty.$

We set by definition $\mathcal{L}_\infty = \mathcal{L}$. The Schatten classes are Banach spaces equipped with the
norms (2.1). The class $\mathcal{L}_1$ is the space of trace class operators. Take an arbitrary ON-basis
$(e_n)_{n\in\mathbb{N}} \subset U$. We define the trace of an operator $T \in \mathcal{L}_1(U)$ as the quantity

$\mathrm{Tr}(T) = \sum_{n\in\mathbb{N}} \langle Te_n, e_n\rangle_U.$

It is independent of the particular choice of ON-basis. If $T \in \mathcal{L}_1$ and $T \ge 0$, then $\mathrm{Tr}(T) = \|T\|_{\mathcal{L}_1}$.
In general, the relation

(2.2) $|\mathrm{Tr}(T)| \le \|T\|_{\mathcal{L}_1}$

holds for $T \in \mathcal{L}_1$. It follows directly from the definition that $\mathrm{Tr}(T) = \mathrm{Tr}(T^*)$ for $T \in \mathcal{L}_1$.
Moreover,

(2.3) $\mathrm{Tr}(ST) = \mathrm{Tr}(TS),$

whenever $S \in \mathcal{L}(U,V)$ and $T \in \mathcal{L}(V,U)$ satisfy $ST \in \mathcal{L}_1(V)$ and $TS \in \mathcal{L}_1(U)$.

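More generally, the class $\mathcal{L}_2(U,V)$ is the space of Hilbert-Schmidt operators from $U$ to $V$. It
is defined as the Hilbert space with the scalar product and norm

(2.4) $\langle S,T\rangle_{\mathcal{L}_2(U,V)} = \sum_{i\in\mathbb{N}} \langle Se_i, Te_i\rangle_V = \mathrm{Tr}(T^*S) = \mathrm{Tr}(ST^*),$

(2.5) $\|T\|_{\mathcal{L}_2(U,V)} = \Big(\sum_{i\in\mathbb{N}} \|Te_i\|_V^2\Big)^{1/2} = \sqrt{\mathrm{Tr}(T^*T)}.$

The choice of ON-basis $(e_n)_{n\in\mathbb{N}} \subset U$ is arbitrary. For $U = V$ the class $\mathcal{L}_2 = \mathcal{L}_2(U)$ is alone to
enjoy this property. For $\mathcal{L}_p$ with $p \ne 2$, only an eigenbasis of $|T|$ can be used.
The following Hölder type inequality for Schatten classes holds:

(2.6) $\|ST\|_{\mathcal{L}_r} \le \|S\|_{\mathcal{L}_p}\|T\|_{\mathcal{L}_q},$

for $r^{-1} = p^{-1} + q^{-1}$, $p, q, r \in [1,\infty]$. The border case

(2.7) $\|ST\|_{\mathcal{L}_r} \le \|S\|_{\mathcal{L}}\|T\|_{\mathcal{L}_r}$

is included, meaning that $\mathcal{L}_r(U)$ is an ideal of the Banach algebra $\mathcal{L}(U)$. Also

(2.8) $|\langle S,T\rangle_{\mathcal{L}_2}| = |\mathrm{Tr}(ST^*)| \le \|ST^*\|_{\mathcal{L}_1} \le \|S\|_{\mathcal{L}}\|T\|_{\mathcal{L}_1}.$

For more about the Schatten classes see [7].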

The tensor product space $U \otimes V$ of two Hilbert spaces $U$ and $V$ is a Hilbert space together
with a bilinear mapping $U \times V \to U \otimes V$, $(u,v) \mapsto u \otimes v$ with dense range and with the inner
product

$\langle u_1 \otimes v_1, u_2 \otimes v_2\rangle_{U\otimes V} = \langle u_1,u_2\rangle_U \langle v_1,v_2\rangle_V, \quad u_1,u_2 \in U, \ v_1,v_2 \in V.$

If $(u_n)_{n\in\mathbb{N}} \subset U$ and $(v_n)_{n\in\mathbb{N}} \subset V$ are ON-bases, then $(u_m \otimes v_n)_{m,n\in\mathbb{N}} \subset U \otimes V$ is an ON-basis.
The space $U \otimes V$ can be realized in several isomorphic ways. If the tensor product $u \otimes v$ realizes
a rank one operator $(u \otimes v)\phi = \langle v,\phi\rangle_V u$, for $\phi \in V$, then $U \otimes V = \mathcal{L}_2(V,U)$. If $U$ and $V$ are
spaces of functions of independent variables $x \in \mathcal{D}_1$ and $y \in \mathcal{D}_2$, then $(u \otimes v)(x,y) = u(x)v(y)$ is
also a realization of $U \otimes V$. For instance, if $U = L_2(\mathcal{D})$ and $V = L_2(\Omega)$, where $\mathcal{D}$ is our spatial
domain and $\Omega$ the sample space, then $U \otimes V = L_2(\Omega \times \mathcal{D}) = L_2(\Omega, L_2(\mathcal{D}))$, i.e., $L_2(\mathcal{D})$-valued
square integrable random variables. For a detailed introduction to tensor products, see [11,
Appendix E].


2.2. Fréchet derivatives. Let $(U, \|\cdot\|_U)$ and $(V, \|\cdot\|_V)$ be Banach spaces. By $C_b^m(U,V)$ we
denote the space of not necessarily bounded mappings $g\colon U \to V$ having $m$ continuous and
bounded Fréchet derivatives $Dg, D^2g, \dots, D^mg$. We endow it with the seminorm $|\cdot|_{C_b^m(U,V)}$,
determined as the smallest constant $C \ge 0$ such that

$\sup_{x\in U} \|D^mg(x)\cdot(\phi_1,\dots,\phi_m)\|_V \le C\|\phi_1\|_U \cdots \|\phi_m\|_U, \quad \forall \phi_1,\dots,\phi_m \in U.$

It will be convenient to write $C_b^m = C_b^m(U,V)$. From the context it will be clear what we mean.

Let us consider the important case when $U$ is a Hilbert space and $V = \mathbb{R}$. The Fréchet
derivative $Dg(x)$ of a function $g\colon U \to \mathbb{R}$ is a bounded linear functional on $U$ for fixed $x \in
U$ and it can thus be identified with its gradient using the Riesz representation theorem, i.e.,
$Dg(x)\cdot\phi = \langle Dg(x),\phi\rangle$. In the same way the second derivative enjoys a representation as a
bounded linear operator by the identity $D^2g(x)\cdot(\phi,\psi) = \langle D^2g(x)\phi,\psi\rangle$. We will use both
representations and it will lead to no confusion.

2.3. The functional analytic framework. We will now introduce the semigroup framework
on which our analysis of equations (1.2) and (1.5) relies. Recall from Section 1 that $A = -\Delta$
with $\mathrm{dom}(A) = H^2(\mathcal{D}) \cap H_0^1(\mathcal{D})$ and $H = L_2(\mathcal{D})$ with $\mathcal{D} \subset \mathbb{R}^d$ a convex polygonal domain.
We denote $\|\cdot\| = \|\cdot\|_H$ and $\langle\cdot,\cdot\rangle = \langle\cdot,\cdot\rangle_H$. The operator $A$ is closed, selfadjoint and positive
definite.

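There is an orthonormal eigenbasis $(\varphi_i)_{i\in\mathbb{N}} \subset H$ with corresponding eigenvalues $0 < \lambda_1 \le
\lambda_2 \le \cdots \le \lambda_i \to \infty$, as $i \to \infty$, for which $A\varphi_i = \lambda_i\varphi_i$, $i \in \mathbb{N}$. The asymptotics $\lambda_i \sim i^{2/d}$, as
$i \to \infty$, is well known. When the space dimension $d = 1$, as in Assumption B, we have

(2.9) $\mathrm{Tr}(A^{-\gamma}) = \|A^{-\gamma}\|_{\mathcal{L}_1} = \|A^{-\gamma/2}\|_{\mathcal{L}_2}^2 < \infty, \quad \forall \gamma > \tfrac12, \quad \text{if } d = 1.$

This means that $\beta \in (0,\frac12)$ under Assumption B.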
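
The summability threshold in (2.9) can be checked numerically. The sketch below assumes the concrete domain $\mathcal{D} = (0,1)$, for which the Dirichlet eigenvalues are $\lambda_i = (i\pi)^2$; this choice and the function name are illustrative assumptions. Partial sums of $\sum_i \lambda_i^{-\gamma}$ settle for $\gamma > \frac12$ and keep growing logarithmically at the borderline $\gamma = \frac12$.

```python
import math


def trace_partial_sum(gamma, n_terms):
    """Partial sum of Tr(A^{-gamma}) = sum_i lambda_i^{-gamma} with lambda_i = (i*pi)^2.

    The eigenvalues are those of the Dirichlet Laplacian on (0, 1), an
    assumption made for this illustration.
    """
    return sum((i * math.pi) ** (-2.0 * gamma) for i in range(1, n_terms + 1))


if __name__ == "__main__":
    for gamma in (0.5, 0.75, 1.0):
        sums = [trace_partial_sum(gamma, n) for n in (10**3, 10**4, 10**5)]
        print(gamma, sums)
```

For $\gamma = 1$ the series equals $\pi^{-2}\sum_i i^{-2} = 1/6$, which gives a simple consistency check on the partial sums.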
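We define norms of fractional orders by

$\|v\|_{\dot H^\beta} = \|A^{\beta/2}v\| = \Big(\sum_{i=1}^\infty \lambda_i^\beta \langle v,\varphi_i\rangle^2\Big)^{1/2}, \quad \beta \in \mathbb{R}.$

The spaces $\dot H^\beta$ are then, for $\beta \ge 0$, defined as $\mathrm{dom}(A^{\beta/2})$ and for $\beta < 0$ as the closure of $H$ with
respect to the $\dot H^\beta$-norm. The space $\dot H^{-\gamma}$ of negative order can be identified with the dual space
of $\dot H^\gamma$. Clearly $\dot H^0 = H$, and it is also well known that $\dot H^1 = H_0^1(\mathcal{D})$ and $\dot H^2 = H^2(\mathcal{D}) \cap H_0^1(\mathcal{D})$,
see [22, Ch. 3].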

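Let $(S_h)_{h\in(0,1)}$ denote a family of standard finite element spaces of continuous piecewise
linear functions corresponding to a quasi-uniform family of triangulations, for which $h$ denotes
the largest diameter in the triangulation. Then $S_h \subset \dot H^1$. By $P_h$ we denote the orthogonal
projector of $H$ onto $S_h$. Let $A_h\colon S_h \to S_h$ be the unique operator satisfying

$\langle A_h\psi,\chi\rangle = \langle\nabla\psi,\nabla\chi\rangle, \quad \forall \psi,\chi \in S_h.$

This is the discrete Laplacian. By definition

(2.10) $\|A_h^{1/2}\varphi_h\| = \|\nabla\varphi_h\| = \|A^{1/2}\varphi_h\| = \|\varphi_h\|_{\dot H^1}, \quad \varphi_h \in S_h.$

Therefore, $P_h$ can be extended to $\dot H^{-1}$, so that for all $\varphi \in \dot H^{-1}$,

(2.11) $\|A_h^{-1/2}P_h\varphi\| = \sup_{v_h\in S_h} \frac{|\langle\varphi,v_h\rangle|}{\|A_h^{1/2}v_h\|} = \sup_{v_h\in S_h} \frac{|\langle\varphi,v_h\rangle|}{\|A^{1/2}v_h\|} \le \sup_{v\in\dot H^1} \frac{|\langle\varphi,v\rangle|}{\|A^{1/2}v\|} = \|A^{-1/2}\varphi\|.$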

Moreover,

(2.12) $\|A_h^{1/2}P_h\varphi\| \le C\|A^{1/2}\varphi\|, \quad \varphi \in \dot H^1$, uniformly in $h$.

This follows from (2.10) and the well-known fact that $P_h$ is bounded with respect to $\|\cdot\|_{\dot H^1} =
\|A^{1/2}\cdot\|$, when we use a quasi-uniform mesh family. Interpolation between this and (2.11) yields

(2.13) $\|A_h^{\gamma}P_h\varphi\| \le C\|A^{\gamma}\varphi\|, \quad \varphi \in \dot H^{2\gamma}, \quad \gamma \in [-\tfrac12, \tfrac12].$
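Furthermore, (2.12) means that $\|A_h^{1/2}P_hA^{-1/2}\|_{\mathcal{L}} < \infty$. Hence,

$\|A^{-1/2}A_h^{1/2}P_h\|_{\mathcal{L}} = \|(A^{-1/2}A_h^{1/2}P_h)^*\|_{\mathcal{L}} = \|A_h^{1/2}P_hA^{-1/2}\|_{\mathcal{L}} \le C,$

so that $\|A^{-1/2}A_h^{1/2}P_hf\| \le C\|f\|$ or

$\|A^{-1/2}\varphi_h\| \le C\|A_h^{-1/2}\varphi_h\|, \quad \varphi_h \in S_h.$

Interpolating between this and (2.10) yields

$\|A^{\gamma}\varphi_h\| \le C\|A_h^{\gamma}\varphi_h\|, \quad \varphi_h \in S_h, \quad \gamma \in [-\tfrac12, \tfrac12].$

Using also (2.13) yields the norm equivalence

(2.14) $c\|A_h^{\gamma}\varphi_h\| \le \|A^{\gamma}\varphi_h\| \le C\|A_h^{\gamma}\varphi_h\|, \quad \varphi_h \in S_h, \quad \gamma \in [-\tfrac12, \tfrac12].$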

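The interpolations above are valid since $(\dot H^\beta)_{\beta\in[-1,1]}$ and $(\dot H_h^\beta)_{\beta\in[-1,1]}$ are real interpolation
spaces, where $\dot H_h^\beta = S_h$ with norm $\|v_h\|_{\dot H_h^\beta} = \|A_h^{\beta/2}v_h\|$. For positive order this is standard, see
for instance [50]. For negative order, let $\beta \in [0,1]$ and notice that

$[\dot H^0, \dot H^{-1}]_{\beta,2} = [(\dot H^0)^*, (\dot H^1)^*]_{\beta,2} = [\dot H^0, \dot H^1]_{\beta,2}^* = (\dot H^\beta)^* = \dot H^{-\beta}.$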

We define the Ritz projector $R_h\colon \dot H^1 \to S_h$ to be the orthogonal projection with respect to
the $\dot H^1$-scalar product. Since $\mathcal{D}$ is convex and polygonal it is well known that

(2.15) $\|A^{s/2}(I - R_h)A^{-r/2}\|_{\mathcal{L}} \le Ch^{r-s}, \quad 0 \le s \le 1 \le r \le 2.$

For $P_h$ the following error estimate holds

(2.16) $\|A^{s/2}(I - P_h)A^{-r/2}\|_{\mathcal{L}} \le Ch^{r-s}, \quad 0 \le s \le 1, \quad 0 \le s \le r \le 2.$

For more about the finite element method, see [2] for elliptic equations and [22] for parabolic.

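Denote by $N_h$ the dimension of $S_h$. There is an orthonormal eigenbasis $(\varphi_i^h)_{i=1}^{N_h} \subset S_h$
corresponding to $A_h$ with eigenvalues $0 < \lambda_1^h \le \lambda_2^h \le \cdots \le \lambda_{N_h}^h$. The operators $-A$ and $-A_h$
generate analytic semigroups $(E(t))_{t\ge 0}$ and $(E_h(t))_{t\ge 0}$, respectively. They are spectrally given
by

(2.17) $E(t)v = \sum_{i=1}^\infty e^{-\lambda_it}\langle v,\varphi_i\rangle\varphi_i, \quad v \in H, \ t \ge 0,$

and

$E_h(t)v_h = \sum_{i=1}^{N_h} e^{-\lambda_i^ht}\langle v_h,\varphi_i^h\rangle\varphi_i^h, \quad v_h \in S_h, \ t \ge 0.$

The semigroup $(E_h(t))_{t\ge 0}$ solves the parabolic equation $\dot u_h + A_hu_h = 0$, $t > 0$, with $u_h(0) = P_hv$,
in the sense that $u_h(t) = E_h(t)P_hv$.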
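
As an illustration of the spectral representation of $E_h(t)$, the sketch below treats the simplest case $\mathcal{D} = (0,1)$ with a uniform mesh of $N$ elements. In that setting (an assumption of this example, not of the paper) the stiffness and consistent mass matrices are tridiagonal Toeplitz matrices with common eigenvectors $\sin(j\pi x_k)$, so the eigenvalues of $A_h$ are available in closed form, $\lambda_j^h = \frac{6}{h^2}\frac{1-\cos(j\pi h)}{2+\cos(j\pi h)}$. The function names are hypothetical.

```python
import math


def discrete_eigenvalue(j, n_elements):
    """j-th eigenvalue of A_h for P1 elements on a uniform mesh of (0, 1).

    Ratio of the stiffness eigenvalue (2 - 2cos(theta))/h and the consistent
    mass eigenvalue h(4 + 2cos(theta))/6 with theta = j*pi*h.
    """
    h = 1.0 / n_elements
    theta = j * math.pi * h
    return 6.0 / h ** 2 * (1.0 - math.cos(theta)) / (2.0 + math.cos(theta))


def apply_discrete_semigroup(coefficients, t, n_elements):
    """Apply E_h(t) to a function given by its coefficients in the discrete eigenbasis."""
    return [math.exp(-discrete_eigenvalue(j, n_elements) * t) * c
            for j, c in enumerate(coefficients, start=1)]


if __name__ == "__main__":
    for n in (8, 16, 32, 64):
        print(n, discrete_eigenvalue(1, n) - math.pi ** 2)
```

The printed differences show $\lambda_1^h$ approaching the exact eigenvalue $\pi^2$ from above at the rate $h^2$, and `apply_discrete_semigroup` damps each mode by the factor $e^{-\lambda_j^h t}$.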

Important for our analysis is the estimate

(2.18) $\|A^{\gamma}E(t)\|_{\mathcal{L}} + \|A_h^{\gamma}E_h(t)P_h\|_{\mathcal{L}} \le Ct^{-\gamma}, \quad \gamma \ge 0, \ t > 0$, uniformly in $h$.

It is standard and is enjoyed by all analytic semigroups.
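
For a selfadjoint positive definite operator the smoothing estimate (2.18) reduces to a scalar inequality: the operator norm of $A^\gamma E(t)$ is $\sup_i \lambda_i^\gamma e^{-\lambda_i t}$, and $\sup_{\lambda>0} \lambda^\gamma e^{-\lambda t} = (\gamma/(et))^\gamma$ for $\gamma > 0$. The sketch below checks this on a truncated spectrum; the eigenvalues $\lambda_i = (i\pi)^2$ of the Dirichlet Laplacian on $(0,1)$, the truncation level and the function names are assumptions made for illustration.

```python
import math


def smoothing_norm(gamma, t, n_modes=2000):
    """Norm of A^gamma E(t) restricted to the first n_modes eigenmodes, lambda_i = (i*pi)^2."""
    return max(((i * math.pi) ** 2) ** gamma * math.exp(-((i * math.pi) ** 2) * t)
               for i in range(1, n_modes + 1))


def smoothing_bound(gamma, t):
    """sup over lambda > 0 of lambda^gamma * exp(-lambda*t), i.e. C * t^(-gamma)."""
    if gamma == 0.0:
        return 1.0
    return (gamma / (math.e * t)) ** gamma


if __name__ == "__main__":
    for gamma in (0.25, 0.5, 1.0):
        for t in (1e-4, 1e-2, 1.0):
            print(gamma, t, smoothing_norm(gamma, t), smoothing_bound(gamma, t))
```

The constant $C = (\gamma/e)^\gamma$ depends only on $\gamma$, which is why the same bound holds for $A_h$ uniformly in $h$.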

Let $P_m$ denote the spectral projection onto the space spanned by the $m$ first eigenvectors
$(\varphi_i)_{i=1}^m$ of $A$. An easy calculation shows that

(2.19) $\|(I - P_m)A^{-r}\|_{\mathcal{L}} \le \lambda_{m+1}^{-r}, \quad r \ge 0.$

In our analysis we will use the notation $a \lesssim b$, to mean that there exists a constant $C > 0$
such that $a \le Cb$. The constant will never depend on the mesh size $h$.
We will frequently use the following Gronwall lemma:

Lemma 2.1 (Generalized Gronwall lemma). Let $\varphi(t) \ge 0$ be a continuous function on $[0,T]$.
If, for some $A, B \ge 0$ and $\alpha, \beta \in [0,1)$, the inequality

$\varphi(t) \le At^{-\alpha} + B\int_0^t (t-s)^{-\beta}\varphi(s)\,ds$

holds, then there is $C = C(B,T,\alpha,\beta)$ such that

$\varphi(t) \le CAt^{-\alpha}, \quad t \in (0,T].$
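
One standard way to obtain an explicit constant in such a lemma, not necessarily the argument the authors have in mind, is to iterate the inequality and use the Beta integral $\int_0^t (t-s)^{-\beta}s^{a-1}\,ds = t^{a-\beta}\,\Gamma(1-\beta)\Gamma(a)/\Gamma(a+1-\beta)$. Formally this gives $\varphi(t) \le At^{-\alpha}\sum_{k\ge 0} \frac{(B\Gamma(1-\beta))^k\,\Gamma(1-\alpha)}{\Gamma(1-\alpha+k(1-\beta))}\,t^{k(1-\beta)}$, provided the remainder of the iteration vanishes. The sketch below evaluates this series at $t = T$; the function name and the truncation parameters are assumptions of the example.

```python
import math


def gronwall_constant(B, T, alpha, beta, max_terms=500, tol=1e-16):
    """Evaluate the series bound C(B, T, alpha, beta) obtained by iterating the inequality.

    Terms are computed in logarithmic form to avoid overflow of the Gamma function.
    """
    if B == 0.0:
        return 1.0
    log_q = math.log(B * math.gamma(1.0 - beta))
    total = 0.0
    for k in range(max_terms):
        log_term = (k * log_q
                    + math.lgamma(1.0 - alpha)
                    + k * (1.0 - beta) * math.log(T)
                    - math.lgamma(1.0 - alpha + k * (1.0 - beta)))
        term = math.exp(log_term)
        total += term
        if k > 0 and term < tol * total:
            break
    return total


if __name__ == "__main__":
    print(gronwall_constant(1.0, 1.0, 0.0, 0.0))   # classical case
    print(gronwall_constant(1.0, 1.0, 0.5, 0.5))   # weakly singular case
```

For $\alpha = \beta = 0$ the series collapses to $\sum_k (BT)^k/k! = e^{BT}$, recovering the classical Gronwall bound.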

2.4. The stochastic integral and Malliavin calculus. Since we use the Malliavin calculus 
in the proof of our main result, we outline a framework for the stochastic integral in which this 
calculus has a natural role. This is an alternative to the more classical procedure, presented 
in [3]. Our presentation of the Wiener integral relies on [23], and the Malliavin calculus on [8] 
and [18] , where a natural extension of the framework of [21] to Hilbert space valued stochastic 
integrals using tensor products is presented. 

The covariance operator $Q \in \mathcal{L}(H)$ is selfadjoint and positive semidefinite. Let $Q^{1/2}$ denote
the unique positive square root. Let $Q^{-1/2}$ be its inverse, restricted to $(\ker Q)^\perp$. Define the
Hilbert space $U_0 = Q^{1/2}(H)$, equipped with the scalar product $\langle u,v\rangle_{U_0} = \langle Q^{-1/2}u, Q^{-1/2}v\rangle$. If
$\mathrm{Tr}(Q) < \infty$, then the triple $i\colon U_0 \hookrightarrow H$ is an abstract Wiener space, where $i$ is the inclusion
mapping $i\colon x \mapsto x$. This triple induces a Gaussian probability measure on $H$ with mean $0$
and covariance $Q$. It is referred to as an abstract Wiener measure. The space $U_0$ is called the
Cameron-Martin space in this context.

Let $I\colon L_2([0,T],U_0) \to L_2(\Omega)$ be an isonormal process: For every $\phi \in L_2([0,T],U_0)$ the
random variable $I(\phi)$ is centered Gaussian and $I$ has the covariance structure

$E[I(\phi)I(\psi)] = \langle\phi,\psi\rangle_{L_2([0,T],U_0)}, \quad \phi,\psi \in L_2([0,T],U_0).$

The existence of $I$ follows by an application of the Kolmogorov Extension Theorem.
Define, for $u \in U_0$, the cylindrical $Q$-Wiener process $W\colon [0,T] \times U_0 \to L_2(\Omega)$ by

$W(t)u = I(\chi_{[0,t]} \otimes u), \quad u \in U_0, \ t \in [0,T].$

For $u \in U_0$ the process $W(t)u$, $t \in [0,T]$, is a Brownian motion and given $u,v \in U_0$

$E[W(t)u\,W(s)v] = \min(s,t)\langle u,v\rangle_{U_0}.$
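
The covariance identity for $W(t)u$ can be checked by simulation in a finite-dimensional toy setting. The sketch below assumes $U_0 = \mathbb{R}^n$ with $Q = I$, so that $W(t)u = \sum_k u_k\beta_k(t)$ with independent standard Brownian motions $\beta_k$; the dimension, the sample size, the seed and the function names are assumptions made for illustration.

```python
import math
import random


def sample_brownian_vectors(rng, times, dim):
    """Sample dim independent Brownian motions at the given increasing times."""
    samples = []
    current = [0.0] * dim
    previous_time = 0.0
    for t in times:
        sd = math.sqrt(t - previous_time)
        current = [c + sd * rng.gauss(0.0, 1.0) for c in current]
        samples.append(current)
        previous_time = t
    return samples


def estimate_covariance(u, v, s, t, n_samples=20000, seed=0):
    """Monte Carlo estimate of E[W(t)u W(s)v] for Q = I on R^dim, assuming s <= t."""
    rng = random.Random(seed)
    dim = len(u)
    acc = 0.0
    for _ in range(n_samples):
        w_s, w_t = sample_brownian_vectors(rng, [s, t], dim)
        w_t_u = sum(a * b for a, b in zip(w_t, u))
        w_s_v = sum(a * b for a, b in zip(w_s, v))
        acc += w_t_u * w_s_v
    return acc / n_samples


if __name__ == "__main__":
    u, v, s, t = [1.0, 0.0], [1.0, 1.0], 0.3, 0.7
    exact = min(s, t) * sum(a * b for a, b in zip(u, v))
    print(estimate_covariance(u, v, s, t), exact)
```

The estimate fluctuates around $\min(s,t)\langle u,v\rangle$ with a statistical error of order $n^{-1/2}$ in the number of samples.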

The space of Hilbert-Schmidt operators $\mathcal{L}_2^0 = \mathcal{L}_2(U_0,H)$ can be identified with $H \otimes U_0$, and
$h \otimes u \in \mathcal{L}_2^0$ for $h \in H$, $u \in U_0$ being the operator $(h \otimes u)v = \langle u,v\rangle_{U_0}h$, $v \in U_0$.

We now define the $H$-valued Wiener integral for the simplest possible integrands. Let $\Phi =
\chi_{[a,b]} \otimes (h \otimes u) \in L_2([0,T],\mathcal{L}_2^0)$, for $a,b \in [0,T]$, $h \in H$ and $u \in U_0$. Then the Wiener integral
of $\Phi$ is defined as the $H$-valued random variable

$\int_0^T \Phi(t)\,dW(t) = I(\chi_{[a,b]} \otimes u) \otimes h = (W(b)u - W(a)u) \otimes h \in L_2(\Omega,H).$

It is not difficult to show that for such integrands the following property, known as Wiener's
isometry, holds:

$E\Big[\Big\|\int_0^T \Phi(t)\,dW(t)\Big\|_H^2\Big] = \int_0^T \|\Phi(t)\|_{\mathcal{L}_2^0}^2\,dt.$

The integral extends directly to linear combinations of such integrands by linearity of $I$. By the
Wiener isometry, completeness of $L_2([0,T],\mathcal{L}_2^0)$ and classical approximation results for $L_2([0,T])$-
functions and for compact operators, it extends to all $\Phi \in L_2([0,T],\mathcal{L}_2^0)$.

Let $C_p^\infty(\mathbb{R}^n)$ denote the space of all $C^\infty$-functions over $\mathbb{R}^n$ with polynomial growth. We define
the family of smooth cylindrical random variables

$\mathcal{S} = \{X = f(I(\phi_1),\dots,I(\phi_N)) : f \in C_p^\infty(\mathbb{R}^N), \ \phi_1,\dots,\phi_N \in L_2([0,T],U_0), \ N \ge 1\}$

and the corresponding family with values in $H$ as

$\mathcal{S}(H) = \Big\{F = \sum_{i=1}^M X_i \otimes h_i : X_1,\dots,X_M \in \mathcal{S}, \ h_1,\dots,h_M \in H, \ M \ge 1\Big\}.$

The Malliavin derivative of a random variable in $\mathcal{S}$ with representation $X = f(I(\phi_1),\dots,I(\phi_N))$
is defined as the $L_2([0,T],U_0)$-valued random variable $DX = \sum_{i=1}^N \partial_if(I(\phi_1),\dots,I(\phi_N)) \otimes \phi_i$.
Clearly this is a $U_0$-valued stochastic process. We write $D_tX = \sum_{i=1}^N \partial_if(I(\phi_1),\dots,I(\phi_N)) \otimes
\phi_i(t)$ for $t \in [0,T]$. The Malliavin derivative of a random variable $F \in \mathcal{S}(H)$ with the
representation $F = \sum_{i=1}^M f_i(I(\phi_1),\dots,I(\phi_N)) \otimes h_i$ is given by

$D_tF = \sum_{i=1}^M \sum_{j=1}^N \partial_jf_i(I(\phi_1),\dots,I(\phi_N)) \otimes (h_i \otimes \phi_j(t)).$

Thus $(D_tF)_{t\in[0,T]}$ is an $\mathcal{L}_2^0$-valued stochastic process. By $D_t^uF$ we denote the derivative of $F$ in
the direction $u \in U_0$ at time $t$, i.e., $D_t^uF = D_tFu$, where

$D_t^uF = \sum_{i=1}^M \sum_{j=1}^N \langle\phi_j(t),u\rangle_{U_0}\,\partial_jf_i(I(\phi_1),\dots,I(\phi_N)) \otimes h_i.$



At the very heart of Malliavin calculus is the following integration by parts formula. It tells
that, for all $F \in \mathcal{S}(H)$ and $\Phi \in L_2([0,T],\mathcal{L}_2^0)$,

(2.20) $E\langle DF,\Phi\rangle_{L_2([0,T],\mathcal{L}_2^0)} = E\Big\langle F, \int_0^T \Phi(t)\,dW(t)\Big\rangle.$

Thus the Wiener integral is the adjoint of $D\colon \mathcal{S}(H) \subset L_2(\Omega,H) \to L_2(\Omega \times [0,T],\mathcal{L}_2^0)$ for
deterministic integrands. Formula (2.20) follows from the corresponding formula for real-valued
smooth stochastic variables. The derivative operator $D$ is known to be closable. We define the
Watanabe Sobolev space $\mathbb{D}^{1,2}(H)$ as the closure of $\mathcal{S}(H)$ with respect to the norm

$\|F\|_{\mathbb{D}^{1,2}(H)} = \Big(E[\|F\|_H^2] + E\Big[\int_0^T \|D_tF\|_{\mathcal{L}_2^0}^2\,dt\Big]\Big)^{1/2}.$



Denote by $\mathrm{dom}(\delta)$ the elements $\Phi \in L_2(\Omega \times [0,T],\mathcal{L}_2^0)$ for which $F \mapsto E[\langle DF,\Phi\rangle_{L_2([0,T],\mathcal{L}_2^0)}]$
defines a bounded linear functional for $F \in \mathbb{D}^{1,2}(H)$. For any such $\Phi$ the functional $l_\Phi(F) =
E[\langle DF,\Phi\rangle_{L_2([0,T],\mathcal{L}_2^0)}]$ can be extended by continuity to all $F \in L_2(\Omega,H)$. The Riesz
representation theorem guarantees the existence of an adjoint operator to $D$, namely $\delta\colon \mathrm{dom}(\delta) \subset
L_2(\Omega \times [0,T],\mathcal{L}_2^0) \to L_2(\Omega,H)$ that satisfies

(2.21) $E[\langle DF,\Phi\rangle_{L_2([0,T],\mathcal{L}_2^0)}] = E[\langle F,\delta(\Phi)\rangle], \quad \forall F \in \mathbb{D}^{1,2}(H).$

This is a natural extension of (2.20) to a much larger class of integrands. In [5, Lemme 2.10] it
is proved that for any predictable process $\Phi \in L_2(\Omega \times [0,T],\mathcal{L}_2^0)$ the action of $\delta$ on $\Phi$ coincides
with that of the Itô integral, i.e.,

$\delta(\Phi) = \int_0^T \Phi(t)\,dW(t).$

Instead of relying on Itô theory we take this as the definition of the Itô integral. We remark
that $\mathrm{dom}(\delta)$ contains processes that are not predictable and thus $\delta$ is an extension of the Itô
integral to such integrands. In this context $\delta$ is called the Skorohod integral.

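The following lemma [5, Lemma 2.1] has a central role in the proof of our main result.

Lemma 2.2. For any random variable $F \in \mathbb{D}^{1,2}(H)$ and any predictable process $\Phi \in L_2([0,T] \times
\Omega,\mathcal{L}_2^0)$ the following integration by parts formula is valid.

$E\Big[\Big\langle\int_0^T \Phi(t)\,dW(t), F\Big\rangle\Big] = E\Big[\int_0^T \langle\Phi(t),D_tF\rangle_{\mathcal{L}_2^0}\,dt\Big].$

Proof. This is just a restatement of (2.21) for $\Phi$ predictable. □

A corollary of Lemma 2.2 is the Itô isometry. It reads

(2.22) $E\Big[\Big\|\int_0^T \Phi(t)\,dW(t)\Big\|_H^2\Big] = E\Big[\int_0^T \|\Phi(t)\|_{\mathcal{L}_2^0}^2\,dt\Big], \quad \forall \Phi \in L_2([0,T] \times \Omega,\mathcal{L}_2^0)$, predictable.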



The Malliavin derivative acts on its adjoint by $D_s^u\delta(\Phi) = \delta(D_s^u\Phi) + \Phi(s)u$ or in terms of the
Itô integral $\delta(\chi_{[0,t]}\Phi) = \int_0^t \Phi(r)\,dW(r)$ with a predictable $\Phi \in L_2([0,T] \times \Omega,\mathcal{L}_2^0)$ satisfying
$\Phi(t) \in \mathbb{D}^{1,2}(\mathcal{L}_2^0)$ for all $t \in [0,T]$:

(2.23) $D_s^u\int_0^t \Phi(r)\,dW(r) = \int_0^t D_s^u\Phi(r)\,dW(r) + \Phi(s)u, \quad 0 \le s \le t \le T.$

If $s > t$, then $D_s^u\int_0^t \Phi(r)\,dW(r) = 0$, since the integral is $\mathcal{F}_t$-measurable. The class of
$F \in \mathbb{D}^{1,2}(H)$ that are $\mathcal{F}_0$-measurable coincides with the class of constant deterministic
random variables. Let $V$ be another separable real Hilbert space and $\sigma \in C_b^1(H,V)$. Then
$\sigma(F) \in \mathbb{D}^{1,2}(V)$ and

(2.24) $D_t^u(\sigma(F)) = D\sigma(F)\cdot D_t^uF, \quad u \in U_0, \ F \in \mathbb{D}^{1,2}(H),$

(2.25) $D_t(\sigma(F)) = D\sigma(F)D_tF, \quad F \in \mathbb{D}^{1,2}(H).$


2.5. Existence and uniqueness. Existence and uniqueness of a solution to (1.2), under
Assumption A with $\beta = 1$, is stated as [3, Theorem 7.4]. This is the case when $\mathrm{Tr}(Q) < \infty$. The
extension to $\beta \in [\frac12, 1)$ is straightforward. For Assumption B existence and uniqueness is given
as [3, Theorem 7.6]. By using the methods of [12] and [17] one can show that the regularity in
space is of order $\beta$, i.e., the solution $X$ is of the form $[0,T] \times \Omega \to \dot H^\beta$, $P$-a.s. Recall here that
$\beta$ is some number $\beta \in [\frac12, 1]$ under Assumption A and any number $\beta \in (0,\frac12)$ under Assumption
B. The family $(X_h)_{h\in(0,1)}$ of solution processes of the discrete equation (1.5), corresponding to
the family of triangulations, is treated analogously and clearly $X_h(t) \in S_h \subset \dot H^1$, $P$-a.s. The
estimate $E\|A_h^{\gamma/2}X_h(t)\|^2 \le C(1 + \|X_0\|^2)$, $\gamma \in [0,1]$, is uniform in $h$ only for $\gamma \in [0,\beta]$. The
strong convergence

(2.26) $\big(E\|X(T) - X_h(T)\|^2\big)^{1/2} \le Ch^\beta,$

is proved in [16] under the assumption of trace class noise. The proof is easier under Assumptions
A and B. We formulate a qualitative bound for the solution processes in the following theorem.

Theorem 2.3. Under either Assumption A or Assumption B there exist unique predictable
solutions $X \in C([0,T],L_2(\Omega,H))$ and $X_h \in C([0,T],L_2(\Omega,S_h))$ to equations (1.2) and (1.5),
respectively. We refer to these solutions as the unique mild solutions of (1.1) and (1.5). There
exists a constant $C$, such that the following moment estimate holds

(2.27) $\sup_{t\in[0,T]} E\|X(t)\|^2 + \sup_{t\in[0,T]} E\|X_h(t)\|^2 \le C(1 + \|X_0\|^2).$

3. Estimates of the Malliavin derivative of the solution 

We consider the Malliavin derivative of the discrete solution process and prove some estimates
needed later. Differentiating the equation (1.5) formally in direction $u \in U_0$, using (2.23), (2.24),
and the fact that we have a deterministic initial value, yields

(3.1) $D_s^uX_h(t) = E_h(t-s)P_hg(X_h(s))u + \int_s^t E_h(t-r)P_hDf(X_h(r))\cdot D_s^uX_h(r)\,dr$
$\qquad + \int_s^t E_h(t-r)P_h\big(Dg(X_h(r))\cdot D_s^uX_h(r)\big)\,dW(r), \quad 0 \le s \le t \le T.$

This equation is treated much like (1.5) itself. It has a unique solution.

Before we proceed to the estimate of the Malliavin derivative we notice that the linear
growth of $f$ and $g$, implied by their bounded first derivatives, and the moment estimate (2.27)
for $X$ and $X_h$ yield

(3.2) $\sup_{t\in[0,T]} E\|f(Y(t))\|^2 + \sup_{t\in[0,T]} E\|g(Y(t))\|_{\mathcal{L}}^2 \lesssim 1 + \|X_0\|^2, \quad Y = X \text{ or } X_h.$

Lemma 3.1. Consider equation (1.5) under Assumption A. Then the Malliavin derivative of
$X_h$, given as the solution $D_sX_h$ to equation (3.1), satisfies for some constant $C = C(T) > 0$
the bound:

$E\big[\|A_h^{(\beta-1)/2}D_sX_h(t)\|_{\mathcal{L}_2^0}^2\big] \le C, \quad 0 \le s \le t \le T.$

Proof. We make use of equation (3.1) with $g(x) = I$, $Dg(x) = 0$, for the proof and recall that
$\beta - 1 \in [-\frac12, 0]$. Fix $u \in U_0$. Thanks to the Cauchy-Schwarz inequality we get that

$E\|D_s^uX_h(t)\|^2 \lesssim \|E_h(t-s)A_h^{(1-\beta)/2}A_h^{(\beta-1)/2}P_hu\|^2 + \int_s^t E\|E_h(t-r)P_hDf(X_h(r))\cdot D_s^uX_h(r)\|^2\,dr.$

In view of (2.13) and the boundedness of $Df$ and $E_h(t)$ we have

(3.3) $E\|D_s^uX_h(t)\|^2 \lesssim \|A_h^{(1-\beta)/2}E_h(t-s)P_h\|_{\mathcal{L}}^2\,\|A^{(\beta-1)/2}u\|^2 + \int_s^t |f|_{C_b^1}^2\,E\|D_s^uX_h(r)\|^2\,dr.$

The analyticity of the semigroup (2.18) yields

$E\|D_s^uX_h(t)\|^2 \lesssim (t-s)^{\beta-1}\|A^{(\beta-1)/2}u\|^2 + \int_s^t E\|D_s^uX_h(r)\|^2\,dr$

and applying Gronwall's Lemma 2.1 for fixed $s \in [0,t)$ gives

(3.4) $E\|D_s^uX_h(t)\|^2 \lesssim (t-s)^{\beta-1}\|A^{(\beta-1)/2}u\|^2.$

Proceeding as in the proof of (3.3), we obtain also

$E\|A_h^{(\beta-1)/2}D_s^uX_h(t)\|^2 \lesssim \|A^{(\beta-1)/2}u\|^2 + \int_s^t E\|D_s^uX_h(r)\|^2\,dr.$

Estimate (3.4) is applicable. Thus

$\int_s^t E\|D_s^uX_h(r)\|^2\,dr \lesssim \int_s^t (r-s)^{\beta-1}\,dr\,\|A^{(\beta-1)/2}u\|^2 \lesssim (t-s)^{\beta}\|A^{(\beta-1)/2}u\|^2,$

and hence

(3.5) $E\|A_h^{(\beta-1)/2}D_s^uX_h(t)\|^2 \lesssim \|A^{(\beta-1)/2}u\|^2.$

Notice that this is uniform with respect to $u \in U_0$. We take an ON-basis $(u_i)_{i\in\mathbb{N}} \subset U_0$ and
compute the $\mathcal{L}_2^0$-norm according to (2.4). Using Tonelli's Theorem and (3.5) we get that

$E\|A_h^{(\beta-1)/2}D_sX_h(t)\|_{\mathcal{L}_2^0}^2 = E\sum_{i\in\mathbb{N}} \|A_h^{(\beta-1)/2}D_s^{u_i}X_h(t)\|^2 = \sum_{i\in\mathbb{N}} E\|A_h^{(\beta-1)/2}D_s^{u_i}X_h(t)\|^2$
$\qquad \lesssim \sum_{i\in\mathbb{N}} \|A^{(\beta-1)/2}u_i\|^2 = \|A^{(\beta-1)/2}\|_{\mathcal{L}_2^0}^2.$

□

For the white noise case we will need the following lemma that is a space discrete analogue
of [5, Lemma 4.3]. Recall that in this case $Q = I$, $U_0 = H$, $\mathcal{L}_2^0 = \mathcal{L}_2$.

Lemma 3.2. Consider equation (1.5) under Assumption B. Then, for $\gamma \in [0,\frac12)$, the Malliavin
derivative satisfies the following estimate:

$E\|A_h^{\gamma/2}D_sX_h(t)\|_{\mathcal{L}}^2 \le C(t-s)^{-\gamma}, \quad 0 \le s < t \le T.$

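Proof. Let $u \in H$, and take norms in (3.1) using the Cauchy-Schwarz inequality and the Itô
isometry (2.22) to get

$E\|A_h^{\gamma/2}D_s^uX_h(t)\|^2 \lesssim E\|A_h^{\gamma/2}E_h(t-s)P_hg(X_h(s))u\|^2$
$\qquad + \int_s^t E\|A_h^{\gamma/2}E_h(t-r)P_hDf(X_h(r))\cdot D_s^uX_h(r)\|^2\,dr$
$\qquad + \int_s^t E\|A_h^{\frac14+\epsilon}A_h^{\gamma/2}E_h(t-r)A_h^{-\frac14-\epsilon}P_hDg(X_h(r))\cdot D_s^uX_h(r)\|_{\mathcal{L}_2}^2\,dr.$

For $\epsilon > 0$ small enough we have by (2.7) and (2.18)

$E\|A_h^{\gamma/2}D_s^uX_h(t)\|^2 \lesssim (t-s)^{-\gamma}\sup_{s\in[0,T]} E\|g(X_h(s))\|_{\mathcal{L}}^2\,\|u\|^2$
$\qquad + \int_s^t (t-r)^{-\gamma}\,|f|_{C_b^1}^2\,E\|A_h^{\gamma/2}D_s^uX_h(r)\|^2\,dr$
$\qquad + \int_s^t (t-r)^{-\gamma-\frac12-2\epsilon}\,\|A_h^{-\frac14-\epsilon}P_h\|_{\mathcal{L}_2}^2\,|g|_{C_b^1}^2\,E\|A_h^{\gamma/2}D_s^uX_h(r)\|^2\,dr.$

By (2.9) and (2.13) we have

$\|A_h^{-\frac14-\epsilon}P_h\|_{\mathcal{L}_2} \le \|A_h^{-\frac14-\epsilon}P_hA^{\frac14+\epsilon}\|_{\mathcal{L}}\,\|A^{-\frac14-\epsilon}\|_{\mathcal{L}_2} < \infty,$

and by Gronwall's Lemma 2.1 and (3.2)

$E\|A_h^{\gamma/2}D_s^uX_h(t)\|^2 \lesssim (t-s)^{-\gamma}(1 + \|X_0\|^2)\|u\|^2.$

□

4. Regularity results for the Kolmogorov equation
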

In [5] , Q] and [23] , in the case of discretization in time, the proofs of the weak convergence is 
performed for finite-dimensional spectral Galcrkin approximations. The use of the Ito formula 
and the Kolmogorov equation is in this way justified. The estimates are uniform in the dimension 
m G N of the approximation space and they thus hold in the limit. The approximation is not 
made explicit in the proof. For the discretization in space some more care need to be taken. 
This is due to the fact that the operators A and Ah do not commute. 

Recall that $P_m$ is the projection onto the subspace $H_m \subset H$ spanned by the first $m \in \mathbb{N}$ eigenvectors $(\varphi_i)_{i=1}^m$ of $A$. Let $A_m = P_mAP_m = AP_m = P_mA$. By $(E_m(t))_{t\ge0}$ we denote the semigroup generated by $-A_m$, i.e., it is given by the first $m$ terms in the spectral representation (2.17) of $(E(t))_{t\ge0}$.
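The truncated spectral semigroup can be illustrated numerically. The following is a minimal sketch, assuming the one-dimensional domain $(0,1)$, for which the Dirichlet eigenvalues are $\lambda_k = (k\pi)^2$; the function names and all parameter values are hypothetical. Functions are represented by their eigenfunction coefficients, so that $E_m(t)$ acts diagonally. The sketch checks the semigroup property and a smoothing bound of the type $\|A_m^{\gamma}E_m(t)\| \le (\gamma/(e\,t))^{\gamma}$, which follows from maximizing $\lambda^{\gamma}e^{-\lambda t}$ over $\lambda > 0$.

```python
import numpy as np

def eigenvalues(n):
    """Dirichlet Laplacian eigenvalues on (0, 1): lambda_k = (k*pi)^2 (assumed 1D setting)."""
    k = np.arange(1, n + 1)
    return (k * np.pi) ** 2

def apply_E_m(coeffs, t, m):
    """Apply E_m(t) to a coefficient vector: damp the first m modes, drop the rest."""
    lam = eigenvalues(coeffs.size)
    out = np.exp(-lam * t) * coeffs
    out[m:] = 0.0
    return out

rng = np.random.default_rng(0)
x = rng.standard_normal(50)       # hypothetical coefficient vector
m, s, t = 10, 0.01, 0.02          # hypothetical truncation level and times

# Semigroup property: E_m(s + t) x = E_m(t) E_m(s) x.
semigroup_gap = np.max(np.abs(apply_E_m(x, s + t, m)
                              - apply_E_m(apply_E_m(x, s, m), t, m)))

# Smoothing: the operator norm of A_m^gamma E_m(t) is max_k lam_k^gamma exp(-lam_k t).
gamma = 0.5
lam = eigenvalues(m)
op_norm = np.max(lam ** gamma * np.exp(-lam * t))
bound = (gamma / (np.e * t)) ** gamma
```

The bound is independent of $m$, which is the kind of uniformity in the dimension that the limit $m \to \infty$ relies on.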

We denote by $X_m^x$ the solution of the equation

$X_m^x(t) = E_m(t)P_mx + \int_0^t E_m(t-s)P_mf(X_m^x(s))\,ds + \int_0^t E_m(t-s)P_mg(X_m^x(s))\,dW(s), \quad t \in [0,T].$

Define the function $u_m(t,x) = \mathbf{E}[G(X_m^x(t))]$ for $t \in [0,T]$ and $x \in H$. Note that $u_m(t,P_mx) = u_m(t,x)$ for $x \in H$. It is well known, see e.g. [2, Theorem 9.16], that $u_m \colon [0,T] \times H \to \mathbb{R}$ is a solution to the Kolmogorov equation

$\frac{\partial u_m}{\partial t}(t,x) + L_mu_m(t,x) = 0, \quad (t,x) \in (0,T] \times H,$
$u_m(0,x) = G(P_mx), \quad x \in H,$

where the Markov generator $L_m$ is given by

$(L_mv)(x) = \langle A_mx - P_mf(x), Dv(x)\rangle - \frac12\mathrm{Tr}\big(P_mg(x)Qg^*(x)P_mD^2v(x)\big), \quad x \in H.$
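The sign convention of the Kolmogorov equation and the generator can be checked on a scalar toy problem. The sketch below is not the infinite-dimensional setting of the paper: it assumes the one-dimensional equation $dX + aX\,dt = \sigma\,dW$ with $G(x) = x^2$, for which $u(t,x) = \mathbf{E}[G(X^x(t))]$ is known in closed form, and verifies by finite differences that $u_t + Lu = 0$ with $(Lv)(x) = a x v'(x) - \frac12\sigma^2 v''(x)$. The coefficients and function names are hypothetical.

```python
import math

a, sigma = 1.5, 0.7  # hypothetical drift and noise coefficients

def u(t, x):
    """u(t, x) = E[X^x(t)^2] for the scalar equation dX + a X dt = sigma dW."""
    decay = math.exp(-2.0 * a * t)
    return x * x * decay + sigma ** 2 * (1.0 - decay) / (2.0 * a)

def kolmogorov_residual(t, x, d=1e-4):
    """Central-difference residual of u_t + L u, with (L v)(x) = a x v' - 0.5 sigma^2 v''."""
    u_t = (u(t + d, x) - u(t - d, x)) / (2 * d)
    u_x = (u(t, x + d) - u(t, x - d)) / (2 * d)
    u_xx = (u(t, x + d) - 2 * u(t, x) + u(t, x - d)) / d ** 2
    return u_t + a * x * u_x - 0.5 * sigma ** 2 * u_xx

residual = kolmogorov_residual(0.3, 0.8)
initial_gap = abs(u(0.0, 0.8) - 0.8 ** 2)
```

The residual vanishes up to discretization error, and $u(0,x) = G(x)$, mirroring the initial condition above.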

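The proof of Theorem 1.1 relies heavily on estimates of the derivatives $Du_m$ and $D^2u_m$ of $u_m$ of the form: for some $\alpha > 0$ we have

(4.1)  $\sup_{x\in H}\|A^{\lambda}Du_m(t,x)\| \le Ct^{-\lambda}|G|_{C_b^1}, \quad t \in (0,T],\ \lambda \in [0,\alpha),$

(4.2)  $\sup_{x\in H}\|A^{\lambda}D^2u_m(t,x)A^{\rho}\|_{\mathcal{L}} \le Ct^{-(\lambda+\rho)}|G|_{C_b^2}, \quad t \in (0,T],\ \lambda,\rho \in [0,\alpha),\ \lambda+\rho < 1.$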

In the case of colored noise it turns out that we need $\alpha > (1+\beta)/2$ to obtain convergence of the right rate. So far, to our knowledge, there is no satisfactory result in this direction for multiplicative noise. But for additive colored noise, case A, the situation is much easier and the estimates hold for $\alpha = 1$, see Lemma 3.3 in [24]. For the white noise case the estimates are stated as Lemma 4.4 and Lemma 4.5 in [5] with $\alpha = \frac12$. Thus, in case A we have $\beta \in [\frac12, 1]$, (4.1) and (4.2) with $\alpha = 1$, and in case B we have $\beta \in [0, \frac12)$ and (4.1) and (4.2) with $\alpha = \frac12$.
Since we have the operator $A$ in (4.1) and (4.2) instead of the more natural choice $A_m$ we outline their proofs. We will use that $Du_m(t,x)\cdot\phi = \mathbf{E}[DG(X_m^x(t))\cdot\eta_m^{\phi,x}(t)]$, where

$\eta_m^{\phi,x}(t) = E_m(t)P_m\phi + \int_0^t E_m(t-s)P_mDf(X_m^x(s))\cdot\eta_m^{\phi,x}(s)\,ds$
$\qquad + \int_0^t E_m(t-s)P_m\big(Dg(X_m^x(s))\cdot\eta_m^{\phi,x}(s)\big)\,dW(s).$



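In the proofs of Lemma 3.3 in [24] for case A with $\alpha = 1$ and Lemma 4.4 in [5] for the case B with $\alpha = \frac12$ it is proved that

(4.3)  $\sup_{x\in H}\big(\mathbf{E}\|\eta_m^{\phi,x}(t)\|^2\big)^{\frac12} \lesssim t^{-\lambda}\|A_m^{-\lambda}P_m\phi\|, \quad t \in (0,T],\ \lambda \in [0,\alpha).$

Therefore

$\langle A^{\lambda}Du_m(t,x),\phi\rangle = \langle Du_m(t,x),A^{\lambda}\phi\rangle = \mathbf{E}\big[DG(X_m^x(t))\cdot\eta_m^{A^{\lambda}\phi,x}(t)\big] \le |G|_{C_b^1}\big(\mathbf{E}\|\eta_m^{A^{\lambda}\phi,x}(t)\|^2\big)^{\frac12}$
$\qquad \lesssim |G|_{C_b^1}t^{-\lambda}\|A_m^{-\lambda}P_mA^{\lambda}\phi\| = |G|_{C_b^1}t^{-\lambda}\|P_m\phi\| \le |G|_{C_b^1}t^{-\lambda}\|\phi\|,$

implying (4.1).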

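For (4.2) we notice that

(4.4)  $D^2u_m(t,x)\cdot(\phi,\psi) = \mathbf{E}\big[D^2G(X_m^x(t))\cdot(\eta_m^{\phi,x}(t),\eta_m^{\psi,x}(t)) + DG(X_m^x(t))\cdot\zeta_m^{\phi,\psi,x}(t)\big],$

where

$\zeta_m^{\phi,\psi,x}(t) = \int_0^t E_m(t-s)P_m\big(D^2f(X_m^x(s))\cdot(\eta_m^{\phi,x}(s),\eta_m^{\psi,x}(s)) + Df(X_m^x(s))\cdot\zeta_m^{\phi,\psi,x}(s)\big)\,ds$
$\qquad + \int_0^t E_m(t-s)P_m\big(D^2g(X_m^x(s))\cdot(\eta_m^{\phi,x}(s),\eta_m^{\psi,x}(s)) + Dg(X_m^x(s))\cdot\zeta_m^{\phi,\psi,x}(s)\big)\,dW(s).$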



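In the proof of Lemma 3.3 in [24] for case A with $\alpha = 1$ and Lemma 4.5 in [5] for the case B with $\alpha = \frac12$ it is shown that

(4.5)  $\Big(\sup_{t\in[0,T]}\sup_{x\in H}\mathbf{E}\|\zeta_m^{\phi,\psi,x}(t)\|^2\Big)^{\frac12} \lesssim \|A_m^{-\lambda}P_m\phi\|\,\|A_m^{-\rho}P_m\psi\|, \quad \lambda,\rho \in [0,\alpha),\ \lambda+\rho < 1.$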

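Since $D^2u_m\cdot(\phi,\psi) = \langle D^2u_m\phi,\psi\rangle$, and by (4.4) and the Cauchy-Schwarz inequality,

$\langle A^{\lambda}D^2u_m(t,x)A^{\rho}\phi,\psi\rangle = \langle D^2u_m(t,x)A^{\rho}\phi,A^{\lambda}\psi\rangle$
$\qquad = \mathbf{E}\big[D^2G(X_m^x(t))\cdot(\eta_m^{A^{\rho}\phi,x}(t),\eta_m^{A^{\lambda}\psi,x}(t)) + DG(X_m^x(t))\cdot\zeta_m^{A^{\rho}\phi,A^{\lambda}\psi,x}(t)\big]$
$\qquad \le |G|_{C_b^2}\big(\mathbf{E}\|\eta_m^{A^{\rho}\phi,x}(t)\|^2\big)^{\frac12}\big(\mathbf{E}\|\eta_m^{A^{\lambda}\psi,x}(t)\|^2\big)^{\frac12} + |G|_{C_b^1}\big(\mathbf{E}\|\zeta_m^{A^{\rho}\phi,A^{\lambda}\psi,x}(t)\|^2\big)^{\frac12}.$

Applying (4.3) and (4.5) yields

$\langle A^{\lambda}D^2u_m(t,x)A^{\rho}\phi,\psi\rangle \lesssim \big(|G|_{C_b^2}t^{-\lambda-\rho} + |G|_{C_b^1}\big)\|A_m^{-\rho}P_mA^{\rho}\phi\|\,\|A_m^{-\lambda}P_mA^{\lambda}\psi\|$
$\qquad \lesssim t^{-\lambda-\rho}|G|_{C_b^2}\|\phi\|\,\|\psi\|.$

Thus (4.2) is valid.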

5. Proof of Theorem 1.1

The error will split into several terms, some of which are common to Assumptions A and 
B and some are not. We will first present the proof under Assumption A. When doing so we 
write it as if the noise were multiplicative, i.e., with the operator $g$ included. This will ease the
presentation of the white noise case B. 
5.1. The case of colored noise. For an $\mathcal{F}_T$-measurable, $H_m$-valued random variable $\xi$, the law of iterated expectation and Proposition 1.12 in [3] yield

(5.1)  $\mathbf{E}[G(\xi)] = \mathbf{E}\big[\mathbf{E}[G(\xi)\,|\,\mathcal{F}_T]\big] = \mathbf{E}\big[\mathbf{E}[G(X_m^{\xi}(0))\,|\,\mathcal{F}_T]\big] = \mathbf{E}[u_m(0,\xi)].$

Thus, the weak error splits into four terms: 

$\mathbf{E}[G(X(T)) - G(X_h(T))]$

$\quad = \mathbf{E}[G(X(T)) - G(X_m(T))] + \mathbf{E}[G(X_m(T)) - G(P_mX_h(T))] + \mathbf{E}[G(P_mX_h(T)) - G(X_h(T))]$

$\quad = \mathbf{E}[G(X(T)) - G(X_m(T))] + u_m(T,X_0) - u_m(T,X_h(0))$
$\qquad + \mathbf{E}[u_m(T,X_h(0)) - u_m(0,X_h(T))] + \mathbf{E}[G(P_mX_h(T)) - G(X_h(T))]$

$\quad = e_1^m(T) + e_2^m(T) + e_3^m(T) + e_4^m(T).$

Our intention is to let $m \to \infty$ and see that the remaining terms are of the right order. The first one is easy to treat since, when we let $m \to \infty$, the term $e_1^m(T) \to 0$ by the low order of weak convergence implied by the strong convergence. The second term $e_2^m(T)$ is still easy but needs computations. These hold under both of our assumptions. Using the Cauchy-Schwarz inequality, estimate (4.1) with $0 < \lambda = \beta - \epsilon < \alpha = 1$, and the error estimate (2.16), we obtain for small $\epsilon > 0$

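$e_2^m(T) = u_m(T,X_0) - u_m(T,P_hX_0) = \int_0^1 \frac{d}{d\lambda}u_m(T,P_hX_0 + \lambda(I-P_h)X_0)\,d\lambda$

$\quad = \int_0^1 \big\langle A^{\beta-\epsilon}Du_m(T,P_hX_0 + \lambda(I-P_h)X_0), P_mA^{-\beta+\epsilon}(I-P_h)X_0\big\rangle\,d\lambda$

$\quad \le \int_0^1 \|A^{\beta-\epsilon}Du_m(T,P_hX_0 + \lambda(I-P_h)X_0)\|\,\|P_m\|_{\mathcal{L}}\,\|A^{\epsilon-\beta}(I-P_h)\|_{\mathcal{L}}\,\|X_0\|\,d\lambda$

$\quad \lesssim h^{2\beta-2\epsilon}T^{-\beta+\epsilon}|G|_{C_b^1}\|X_0\| \lesssim h^{2\beta-2\epsilon}$, uniformly in $m$.

Here we used that $A$ and $P_m$ commute and that

$\|A^{\epsilon-\beta}(I-P_h)\|_{\mathcal{L}} \le \|(A^{\epsilon-\beta}(I-P_h))^*\|_{\mathcal{L}} \le \|(I-P_h)A^{\epsilon-\beta}\|_{\mathcal{L}}.$

Notice here that we could have got a sharp result with $\epsilon = 0$ under Assumption A, in the case $\beta < 1$. However, $e_3^m(T)$ does not allow a sharp rate.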

We now turn to the third error term $e_3^m(T)$. For this we need the Markov generator $L_h$ of the finite element solution $X_h$. It is given by

$(L_hv)(x) = \langle A_hx - P_hf(x), Dv(x)\rangle - \frac12\mathrm{Tr}\big(P_hg(x)Qg^*(x)P_hD^2v(x)\big), \quad x \in S_h.$

Itô's formula and the Kolmogorov equation give that

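$e_3^m(T) = -\mathbf{E}\big[u_m(T-t,X_h(t))\big|_{t=T} - u_m(T-0,X_h(0))\big]$

$\quad = \mathbf{E}\int_0^T \Big[\frac{\partial u_m}{\partial t}(T-t,X_h(t)) + L_hu_m(T-t,X_h(t))\Big]\,dt$

$\quad = -\mathbf{E}\int_0^T (L_m - L_h)u_m(T-t,X_h(t))\,dt.$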



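The error $e_3^m(T)$ now naturally divides into three terms:

$|e_3^m(T)| \le \Big|\mathbf{E}\int_0^T \big\langle (A_m - A_h)X_h(t), Du_m(T-t,X_h(t))\big\rangle\,dt\Big|$

$\quad + \Big|\mathbf{E}\int_0^T \big\langle (P_m - P_h)f(X_h(t)), Du_m(T-t,X_h(t))\big\rangle\,dt\Big|$

$\quad + \frac12\Big|\mathbf{E}\int_0^T \mathrm{Tr}\Big(\big[P_mg(X_h(t))Qg^*(X_h(t))P_m - P_hg(X_h(t))Qg^*(X_h(t))P_h\big]D^2u_m(T-t,X_h(t))\Big)\,dt\Big|$

$\quad = I + J + K.$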
The Ritz projection $R_h$ can be expressed in the form $R_h = A_h^{-1}P_hA$. Observing this we can write

$\langle (A_m - A_h)X_h, Du_m\rangle = \langle (A_mP_h - P_hA_h)X_h, Du_m\rangle = \langle X_h, (P_hA_m - A_hP_h)Du_m\rangle$

$\quad = \langle X_h, A_hP_h(A_h^{-1}P_hA_m - I)Du_m\rangle = \langle X_h, A_hP_h(A_h^{-1}P_hAP_m - I)Du_m\rangle$

$\quad = \langle X_h, A_hP_h(R_h - I)P_mDu_m\rangle + \langle X_h, A_hP_h(P_m - I)Du_m\rangle.$
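The representation $R_h = A_h^{-1}P_hA$ of the Ritz projection can be checked numerically in a simple setting. The sketch below assumes the one-dimensional domain $(0,1)$ with continuous piecewise linear elements on a uniform mesh and the test function $v(x) = \sin(\pi x)$; all function names are hypothetical. It uses the standard fact that in this one-dimensional setting the Ritz projection coincides with the nodal interpolant, so applying $A_h^{-1}P_hA$ to $v$ should reproduce the nodal values of $v$ up to quadrature error.

```python
import numpy as np

def p1_matrices(n):
    """Stiffness K and mass M for P1 elements on a uniform mesh of (0, 1), n interior nodes."""
    h = 1.0 / (n + 1)
    K = (2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h
    M = h * (4 * np.eye(n) + np.eye(n, k=1) + np.eye(n, k=-1)) / 6
    return K, M, h

def load_vector(func, n, h):
    """b_i = integral of func * phi_i, by 3-point Gauss quadrature on each element."""
    pts = np.array([-np.sqrt(0.6), 0.0, np.sqrt(0.6)])
    wts = np.array([5.0, 8.0, 5.0]) / 9.0
    nodes = np.linspace(0.0, 1.0, n + 2)
    b = np.zeros(n + 2)
    for e in range(n + 1):
        xq = nodes[e] + 0.5 * h * (pts + 1.0)   # quadrature points in element e
        wq = 0.5 * h * wts
        phi_right = (xq - nodes[e]) / h          # hat function of the right node
        b[e] += np.sum(wq * func(xq) * (1.0 - phi_right))
        b[e + 1] += np.sum(wq * func(xq) * phi_right)
    return b[1:-1]                               # drop boundary nodes (Dirichlet)

n = 32
K, M, h = p1_matrices(n)
Av = lambda x: np.pi ** 2 * np.sin(np.pi * x)    # A v = -v'' for v = sin(pi x)

A_h = np.linalg.solve(M, K)                      # matrix of the discrete Laplacian A_h
PhAv = np.linalg.solve(M, load_vector(Av, n, h)) # coefficients of P_h A v
ritz = np.linalg.solve(A_h, PhAv)                # coefficients of A_h^{-1} P_h A v

interior = np.linspace(0.0, 1.0, n + 2)[1:-1]
ritz_error = np.max(np.abs(ritz - np.sin(np.pi * interior)))
```

Here `A_h` and `PhAv` are formed explicitly only to mirror the operator identity; in practice one would solve with `K` directly.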

This enables us to rewrite the term $I$ so that we can apply the error estimates (2.15) and (2.19) for $R_h$ and $P_m$ respectively. We substitute for $X_h$ the mild equation (1.2), treat the terms separately, and estimate

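$I \le \Big|\mathbf{E}\int_0^T \big\langle E_h(t)P_hX_0, A_hP_h(R_h - I)P_mDu_m(T-t,X_h(t))\big\rangle\,dt\Big|$

$\quad + \Big|\mathbf{E}\int_0^T \Big\langle \int_0^t E_h(t-s)P_hf(X_h(s))\,ds, A_hP_h(R_h - I)P_mDu_m(T-t,X_h(t))\Big\rangle\,dt\Big|$

$\quad + \Big|\mathbf{E}\int_0^T \Big\langle \int_0^t E_h(t-s)P_hg(X_h(s))\,dW(s), A_hP_h(R_h - I)P_mDu_m(T-t,X_h(t))\Big\rangle\,dt\Big|$

$\quad + \Big|\mathbf{E}\int_0^T \big\langle A_hX_h(t), (P_m - I)Du_m(T-t,X_h(t))\big\rangle\,dt\Big|$

$\quad = I_1^h + I_2^h + I_3^h + I^m.$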



We now estimate l\. Let e > be small. Using (|2~T51) . ([2TT5]) , ([235]) . and (j¥TT|) yields 



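$I_1^h = \Big|\mathbf{E}\int_0^T \big\langle A_h^{1-\epsilon}E_h(t)P_hX_0, A_h^{\epsilon}P_h(R_h - I)A^{-\beta+\epsilon}P_mA^{\beta-\epsilon}Du_m(T-t,X_h(t))\big\rangle\,dt\Big|$

$\quad \le \int_0^T \|A_h^{1-\epsilon}E_h(t)P_h\|_{\mathcal{L}}\,\|X_0\|\,\|A_h^{\epsilon}P_h(R_h - I)A^{-\beta+\epsilon}\|_{\mathcal{L}}\,\|P_m\|_{\mathcal{L}}\,\sup_{x\in H}\|A^{\beta-\epsilon}Du_m(T-t,x)\|\,dt$

$\quad \lesssim h^{2\beta-4\epsilon}\int_0^T t^{-1+\epsilon}(T-t)^{-\beta+\epsilon}\,dt\,|G|_{C_b^1}\|X_0\| \lesssim h^{2\beta-4\epsilon}.$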
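The finiteness of the time integral in the estimate of the first term can be made explicit. The following short computation is a sketch, assuming $\epsilon > 0$ and $\beta - \epsilon < 1$ and using only the substitution $t = T\tau$ and the definition of the Beta function $B$:

```latex
\int_0^T t^{-1+\epsilon}(T-t)^{-\beta+\epsilon}\,dt
  = T^{2\epsilon-\beta}\int_0^1 \tau^{\epsilon-1}(1-\tau)^{(1-\beta+\epsilon)-1}\,d\tau
  = T^{2\epsilon-\beta}\,B(\epsilon,\,1-\beta+\epsilon) < \infty .
```

Both singularities are integrable precisely because the exponents $-1+\epsilon$ and $-\beta+\epsilon$ are strictly larger than $-1$, which is why a small loss $\epsilon > 0$ in the rate is accepted.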



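The term $I_2^h$ is easily estimated as follows:

$I_2^h = \Big|\mathbf{E}\int_0^T \Big\langle \int_0^t A_h^{1-\epsilon}E_h(t-s)P_hf(X_h(s))\,ds, A_h^{\epsilon}P_h(R_h - I)A^{-\beta+\epsilon}P_mA^{\beta-\epsilon}Du_m(T-t,X_h(t))\Big\rangle\,dt\Big|$

$\quad \le \int_0^T\int_0^t \|A_h^{1-\epsilon}E_h(t-s)P_h\|_{\mathcal{L}}\big(\mathbf{E}\|f(X_h(s))\|^2\big)^{\frac12}$
$\qquad \times \|A_h^{\epsilon}P_h(R_h - I)A^{-\beta+\epsilon}\|_{\mathcal{L}}\,\|P_m\|_{\mathcal{L}}\,\sup_{x\in H}\|A^{\beta-\epsilon}Du_m(T-t,x)\|\,ds\,dt.$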

Using ([2TT5]) . ([2~T5]l . (gH5]| and Q£2) yields 

$I_2^h \lesssim h^{2\beta-4\epsilon}\int_0^T\int_0^t (T-t)^{-\beta+\epsilon}(t-s)^{-1+\epsilon}\,ds\,dt \lesssim h^{2\beta-4\epsilon}.$

For $I_3^h$ we use the Malliavin integration by parts formula from Lemma 2.2 together with the chain rule (2.25) to obtain the error representation

$I_3^h = \Big|\mathbf{E}\int_0^T \Big\langle \int_0^t E_h(t-s)P_hg(X_h(s))\,dW(s), A_hP_h(R_h - I)P_mDu_m(T-t,X_h(t))\Big\rangle\,dt\Big|$

$\quad = \Big|\mathbf{E}\int_0^T\int_0^t \big\langle E_h(t-s)P_hg(X_h(s)), A_hP_h(R_h - I)P_mD^2u_m(T-t,X_h(t))P_mD_sX_h(t)\big\rangle_{\mathcal{L}_2^0}\,ds\,dt\Big|.$
Here we treat Assumptions A and B separately and start with A; B is postponed to the next subsection. Distributing powers of $A$ and $A_h$ carefully and setting $g(x) = I$, we write

$\langle E_hP_h, A_hP_h(R_h - I)P_mD^2u_mP_mD_sX_h\rangle_{\mathcal{L}_2^0} = \big\langle A_h^{1-\epsilon}E_hA_h^{\frac{\beta-1}{2}}P_h,$
$\qquad A_h^{\frac{1-\beta}{2}+\epsilon}P_h(R_h - I)A^{-\frac{1+\beta}{2}+\epsilon}P_mA^{\frac{1+\beta}{2}-\epsilon}D^2u_mA^{\frac{1-\beta}{2}}P_mA^{\frac{\beta-1}{2}}D_sX_h\big\rangle_{\mathcal{L}_2^0}.$



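Using the Cauchy-Schwarz inequality for $\mathcal{L}_2^0$ and (2.7) yields

$I_3^h \le \mathbf{E}\int_0^T\int_0^t \|A_h^{1-\epsilon}E_h(t-s)P_h\|_{\mathcal{L}}\,\|A_h^{\frac{\beta-1}{2}}P_h\|_{\mathcal{L}_2^0}\,\|A_h^{\frac{1-\beta}{2}+\epsilon}P_h(R_h - I)A^{-\frac{1+\beta}{2}+\epsilon}\|_{\mathcal{L}}\,\|P_m\|_{\mathcal{L}}$
$\qquad \times \sup_{x\in H}\|A^{\frac{1+\beta}{2}-\epsilon}D^2u_m(T-t,x)A^{\frac{1-\beta}{2}}\|_{\mathcal{L}}\,\|A^{\frac{\beta-1}{2}}D_sX_h(t)\|_{\mathcal{L}_2^0}\,ds\,dt.$

We use (2.13) to get $\|A_h^{\frac{\beta-1}{2}}P_h\|_{\mathcal{L}_2^0} \le \|A^{\frac{\beta-1}{2}}\|_{\mathcal{L}_2^0}$. The norm equivalence (2.14) and the fact that $D_s^uX_h(t) \in S_h$, $\mathbf{P}$-a.s., for every $u \in U$ yield

$\|A^{\frac{\beta-1}{2}}D_sX_h(t)\|_{\mathcal{L}_2^0} \lesssim \|A_h^{\frac{\beta-1}{2}}D_sX_h(t)\|_{\mathcal{L}_2^0}.$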

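The analyticity of the semigroup (2.18), the error estimate (2.15) together with (2.13), the gradient estimate (4.2), Tonelli's theorem and the Cauchy-Schwarz inequality now imply that

$I_3^h \lesssim h^{2\beta-4\epsilon}|G|_{C_b^2}\|A^{\frac{\beta-1}{2}}\|_{\mathcal{L}_2^0}\int_0^T\int_0^t \big(\mathbf{E}\|A_h^{\frac{\beta-1}{2}}D_sX_h(t)\|_{\mathcal{L}_2^0}^2\big)^{\frac12}(T-t)^{-1+\epsilon}(t-s)^{-1+\epsilon}\,ds\,dt.$

Applying Lemma 3.1 we finally get

$I_3^h \lesssim h^{2\beta-4\epsilon}\int_0^T\int_0^t (T-t)^{-1+\epsilon}(t-s)^{-1+\epsilon}\,ds\,dt \lesssim h^{2\beta-4\epsilon}.$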

Using ([27T9]) . ([4TT]). the Cauchy-Schwarz inequality and (f2T27|) yields 

I m <E [ \\A h X h (t)\\c\\(P m - I)A-i +e \\ sup \\A^Du m (T - t,x)\\ c dt 

J0 xEH 

I rT 

<X^ +£ \\A h P h \\ c ( sup E\\X h (t)\\ 2 Y / (T-i)-^di. 

V iG[0,Tl 7 JO 



Letting $m \to \infty$ for fixed $h$ yields $I^m \to 0$ and $\lim_{m\to\infty} I \lesssim h^{2\beta-4\epsilon}$.

The term $J$ is considered next. Writing $P_m - P_h = (P_m - I) + (I - P_h)$ we get the natural decomposition $J \le J^m + J^h$. Using the Cauchy-Schwarz inequality, the error estimate (2.16),
Iglljl , and (J3TU) yields for i e {h, m} 



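$J^i = \Big|\mathbf{E}\int_0^T \big\langle (I - P_i)Du_m(T-t,P_mX_h(t)), f(X_h(t))\big\rangle\,dt\Big|$

$\quad \le \int_0^T \|(I - P_i)A^{-\beta+\epsilon}\|_{\mathcal{L}}\,\sup_{x\in H_m}\|A^{\beta-\epsilon}Du_m(T-t,x)\|\,\big(\mathbf{E}\|f(X_h(t))\|^2\big)^{\frac12}\,dt$

$\quad \lesssim \|(I - P_i)A^{-\beta+\epsilon}\|_{\mathcal{L}}\,|G|_{C_b^1}\int_0^T (T-t)^{-\beta+\epsilon}\,dt.$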



We have 

jh < /l 2/9-2e ) and jm < x ~p+^ 

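For $K$ we write

$P_mgQg^*P_m - P_hgQg^*P_h = P_hgQg^*(I - P_h) + (I - P_h)gQg^*P_m + (P_m + P_h)gQg^*(P_m - I),$

and hence we get the following decomposition

$2K = \Big|\mathbf{E}\int_0^T \mathrm{Tr}\Big(\big[P_mg(X_h(t))Qg^*(X_h(t))P_m - P_hg(X_h(t))Qg^*(X_h(t))P_h\big]D^2u_m(T-t,P_mX_h(t))\Big)\,dt\Big|$

$\quad \le \Big|\mathbf{E}\int_0^T \mathrm{Tr}\big(P_hg(X_h(t))Qg^*(X_h(t))(I - P_h)D^2u_m(T-t,P_mX_h(t))\big)\,dt\Big|$

$\quad + \Big|\mathbf{E}\int_0^T \mathrm{Tr}\big((I - P_h)g(X_h(t))Qg^*(X_h(t))P_mD^2u_m(T-t,P_mX_h(t))\big)\,dt\Big|$

$\quad + \Big|\mathbf{E}\int_0^T \mathrm{Tr}\big((P_m + P_h)g(X_h(t))Qg^*(X_h(t))(P_m - I)D^2u_m(T-t,P_mX_h(t))\big)\,dt\Big|$

$\quad = K_1^h + K_2^h + K^m.$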

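Assumption A is treated first; B is postponed. By (2.3), (2.2) and (2.7), we have

$\mathrm{Tr}(P_hQ(I - P_h)D^2u_m)$

$\quad = \mathrm{Tr}\big(P_hQ(I - P_h)D^2u_mA^{\frac{1-\beta}{2}}A^{\frac{\beta-1}{2}}\big) = \mathrm{Tr}\big(A^{\frac{\beta-1}{2}}P_hQ(I - P_h)D^2u_mA^{\frac{1-\beta}{2}}\big)$

$\quad = \mathrm{Tr}\big(A^{\frac{\beta-1}{2}}P_hA^{\frac{1-\beta}{2}}\,A^{\frac{\beta-1}{2}}QA^{\frac{\beta-1}{2}}\,A^{\frac{1-\beta}{2}}(I - P_h)A^{-\frac{1+\beta}{2}+\epsilon}\,A^{\frac{1+\beta}{2}-\epsilon}D^2u_mA^{\frac{1-\beta}{2}}\big)$

$\quad \le \|A^{\frac{\beta-1}{2}}P_hA^{\frac{1-\beta}{2}}\|_{\mathcal{L}}\,\|A^{\frac{\beta-1}{2}}\|_{\mathcal{L}_2^0}^2\,\|A^{\frac{1-\beta}{2}}(I - P_h)A^{-\frac{1+\beta}{2}+\epsilon}\|_{\mathcal{L}}\,\|A^{\frac{1+\beta}{2}-\epsilon}D^2u_mA^{\frac{1-\beta}{2}}\|_{\mathcal{L}},$

where we used the fact that

$\|A^{\frac{\beta-1}{2}}QA^{\frac{\beta-1}{2}}\|_{\mathcal{L}_1} = \mathrm{Tr}\big((A^{\frac{\beta-1}{2}}Q^{\frac12})(A^{\frac{\beta-1}{2}}Q^{\frac12})^*\big) = \|A^{\frac{\beta-1}{2}}Q^{\frac12}\|_{\mathcal{L}_2}^2 = \|A^{\frac{\beta-1}{2}}\|_{\mathcal{L}_2^0}^2.$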
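The identity between the trace norm of $BQB$ and the squared Hilbert-Schmidt norm of $BQ^{1/2}$, for a symmetric operator $B$, can be sanity-checked with matrices. The sketch below is a finite-dimensional illustration only: it assumes a diagonal matrix standing in for $A^{(\beta-1)/2}$ (with the hypothetical eigenvalues $(k\pi)^2$ and $\beta = 0.75$) and a randomly generated symmetric positive semidefinite matrix standing in for the covariance $Q$.

```python
import numpy as np

rng = np.random.default_rng(1)
n, beta = 6, 0.75                              # hypothetical dimension and exponent

lam = (np.arange(1, n + 1) * np.pi) ** 2       # stand-in eigenvalues of A
B = np.diag(lam ** ((beta - 1) / 2))           # stand-in for A^{(beta-1)/2}

R = rng.standard_normal((n, n))
Q = R @ R.T                                    # symmetric positive semidefinite covariance

# Symmetric square root of Q via its eigendecomposition.
w, V = np.linalg.eigh(Q)
Q_half = V @ np.diag(np.sqrt(np.clip(w, 0.0, None))) @ V.T

# B Q B is positive semidefinite, so its trace norm equals its trace.
trace_norm = np.trace(B @ Q @ B)
hs_norm_sq = np.linalg.norm(B @ Q_half, "fro") ** 2
```

The two quantities agree up to rounding, since $BQB = (BQ^{1/2})(BQ^{1/2})^*$.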



By (EH and (233) || A^P^A 1 ^ || £ < || A^P, A^ || £ < ||A^A^|| £ = 1. Using 
(EUl) , and gU gives us 

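$K_1^h \lesssim h^{2\beta-2\epsilon}\|A^{\frac{\beta-1}{2}}\|_{\mathcal{L}_2^0}^2|G|_{C_b^2}\int_0^T (T-t)^{-1+\epsilon}\,dt \lesssim h^{2\beta-2\epsilon}.$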

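For $K_2^h$ we compute similarly

$\mathrm{Tr}((I - P_h)QP_mD^2u_m) = \mathrm{Tr}\big(A^{-\frac{1+\beta}{2}+\epsilon}(I - P_h)A^{\frac{1-\beta}{2}}\,A^{\frac{\beta-1}{2}}QA^{\frac{\beta-1}{2}}\,A^{\frac{1-\beta}{2}}P_mD^2u_mA^{\frac{1+\beta}{2}-\epsilon}\big)$

$\quad \le \|A^{-\frac{1+\beta}{2}+\epsilon}(I - P_h)A^{\frac{1-\beta}{2}}\|_{\mathcal{L}}\,\|A^{\frac{\beta-1}{2}}\|_{\mathcal{L}_2^0}^2\,\|A^{\frac{1-\beta}{2}}D^2u_mA^{\frac{1+\beta}{2}-\epsilon}\|_{\mathcal{L}},$

where

$\|A^{-\frac{1+\beta}{2}+\epsilon}(I - P_h)A^{\frac{1-\beta}{2}}\|_{\mathcal{L}} \le \|(A^{-\frac{1+\beta}{2}+\epsilon}(I - P_h)A^{\frac{1-\beta}{2}})^*\|_{\mathcal{L}} = \|A^{\frac{1-\beta}{2}}(I - P_h)A^{-\frac{1+\beta}{2}+\epsilon}\|_{\mathcal{L}},$

so that (2.16) applies. Hence,

$K_2^h \lesssim h^{2\beta-2\epsilon}\|A^{\frac{\beta-1}{2}}\|_{\mathcal{L}_2^0}^2|G|_{C_b^2}\int_0^T (T-t)^{-1+\epsilon}\,dt \lesssim h^{2\beta-2\epsilon}.$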

Term K m is treated analogously as K\. We have A m < X n f +t . 

Finally, by the Lipschitz continuity of $G$, the Dominated Convergence Theorem and the strong convergence $P_m \to I$ we get

$e_4^m(T) \le |G|_{C_b^1}\mathbf{E}\|(P_m - I)X_h(T)\| \to 0$, as $m \to \infty$.

We conclude that $e(T) = O(h^{2\gamma})$ for any $\gamma < \beta$, which completes the proof in case A.
5.2. The case of white noise. Now consider the case of Assumption B. All estimates above, except those for $I_3^h$, $K_1^h$, and $K^m$, hold under B by setting $Q = I$ and $\beta = \frac12$ and recalling that $U_0 = H$ and $\mathcal{L}_2^0 = \mathcal{L}_2$. We now complete the proof with the remaining estimates. Using Hölder's inequality (2.8) yields



J (E h {t-s)P h g{X h {s)),A h P h (R h -I) 
x P m D 2 u m (T - i, P m X h (t))P m D s X h {t)\ ds dt 

Jo Jo 

x sup \\A^D 2 u m {T~Ux)A^\\ c \\A-^A-^P h \\ c M^D s X h {t)\\ c dsdt. 

First, using (2.13) and (2.9), we have

\\A-^A^P h \\ Cl = \\A- h ^P h A-^\\ Cl < \\Af*P h A 2e \\c\\A-i-*\\ Cl < \\A-^\\ Cl . 
Now we apply ([235]) . IpTIS]) . fj472]) with p = A = \ - e < a = ±, to get 

Is <^" 8£ |G| C , / T /* (E|| ff (X fe ( S )) |||) * (E|| ^ (T - 1)"^ 2 ^* - s)-^ 36 d S d*. 
Jo Jo 

Finally, Lemma 3.2 together with (3.2) finishes the estimate of $I_3^h$. Indeed,

I'i < h^lGlc^j (T -t)- 1+2e (t- s)- 1+c dsdt < h 1 -^. 

For Ki we use Holder's inequality (|2.6j) . (|2.16[) . and Lemma [4T2l to get 

2K,< [ E\\A-^P h g(X h (t))g*(X h (t))A- e \\ Cl \\A^I - P h )A-^\\ c 
Jo 



x sup \\A~ D 2 u m (T- t,x)A— \\ c dt 



<h 1 -^ sup E||A-^^ 5 (X fc (t)) 5 '(Jr h (t))|| £3/(a _ !h) ||A- e || £!l/ ,.|G| c g f{T-t)-^dt 
te[o,r] Jo 

J^ 1 " 3 ' sup EHs^WJIlillA-^IU 2A2 _ 3e) ||^- e |U 2/3e |G| c =. 
te[o,x] 



We compute and use (2.9) to conclude

H^C = E(V)- E A - " Tr(^-l) < 00 
ieN ieN 

ii^CSS - E(^)^ = E*r™ = * 0- (s) ) < 00. 

The terms $K_2^h$ and $K^m$ admit the same treatment.

We have $e(T) = O(h^{2\gamma})$ for any $\gamma < \frac12$.